In this Issue

The Massachusetts Institute of Technology's X Window System, Version 11,
has become an industry standard window system for supporting user
interfaces in networks of workstations running under AT&T's UNIX operating
system. In Hewlett-Packard terms, this means HP 9000 Series 300 and 800
workstations running under the HP-UX operating system. The X Window
System lets an application program running on one workstation display
information to a user sitting at any workstation in the network. HP 9000 Series
300 and 800 workstations also offer a high-performance 2D and 3D graphics
library called Starbase. Naturally, users wanted their application programs
to be able to use the Starbase graphics library and run under the X Window System. Unfortunately,
they couldn't do both at once. The two systems had been designed independently, and both
assumed exclusive ownership of the display and input devices. Furthermore, while many X
applications could be active in the network simultaneously, only one Starbase application could
run on a workstation. As a result of these differences, the two systems couldn't coexist. Working
out a solution to this problem required a joint effort of engineers at two HP divisions, dubbed the
Starbase/X11 Merge project. Merging the two systems was a nontrivial technical challenge. It
had to be done without sacrificing the performance of Starbase applications or requiring that they
be rewritten. As related in the article on page 6, it required changes to the architecture of both
systems, development of cooperating display drivers for the two systems, restructuring the interface
between the drivers and the X server process, and development of a facility to handle
communication between the two systems. In other articles, you'll find details of the changes as they relate
to the management of graphics resources (page 12), access to display hardware (page 20), use
of display memory (page 33), sharing of input devices (page 38), and modification of existing test
suites (page 42).

The capabilities of the Starbase graphics library include high-performance color rendering and 
3D solids modeling. For determining the intensity of light reflected to the observer's eye from any 
object, the library offers three illumination models — one local and two global. A local model 
considers only the orientation of an object and light from light sources. A global model also
considers light reflected from or transmitted through other objects in the scene. The two Starbase 
global illumination models are based on methods called ray tracing and radiosity. In the paper 
on page 78, David Burgoon presents the mathematical foundations of the radiosity method and 
compares its capabilities and limitations with those of the ray tracing method. 

The Starbase graphics library runs on HP 9000 Computers equipped with the SRX or TurboSRX
graphics subsystems. The TurboSRX is an enhanced-performance version of the SRX design.
On page 74, Larry Thayer explains how analysis of the data-flow pipeline of the SRX revealed
where custom VLSI chips could be used to improve the performance. He then describes three
chips that were designed to take advantage of these opportunities for the TurboSRX version.

For HP's commercial computer systems based on the HP 3000 Computer, the last resort in
troubleshooting usually involves analyzing a dump of the computer's memory. While powerful
tools have evolved for on-line dump analysis, until recently no parallel progress had occurred
that would allow efficient on-line examination of operating system source code. After finding clues
in the memory dump, HP support engineers had to rely on a complex manual process to locate
specific source code in a printed listing. Fortunately, this isn't true anymore. HP support facilities
now have HP Source Reader, a system for accessing source code stored on compact disk
read-only memory, or CD-ROM. The source code is stored on the CD-ROM in a proprietary format
and is retrieved by an access program that runs on an HP Vectra Personal Computer and allows
relevant information to be popped onto the screen in seconds. On page 50, three of the system's
designers, support engineers themselves, describe HP Source Reader and present an example
of its use.

As integrated circuit clock rates and signal transitions have become faster and faster, it has
become necessary to treat even very short wires and printed circuit board traces as transmission
lines. This means that impedance matching, reflections, and propagation delays are important
considerations. In automatic testers for such high-speed devices, transmission line techniques
must be applied to the tester-to-device interconnection if the device is to be tested at operating
speeds and accurate results are required. The paper on page 58 describes how this interconnection
is implemented in the HP 82000 IC Evaluation System to ensure high-precision measurements
even for difficult-to-test CMOS devices. A resistive divider arrangement makes it possible to test
low-output-current devices up to their maximum operating frequencies.

R.P. Dolan 
Editor 

Cover 

This HP 9000 Series 300 display shows the results obtainable using a Starbase/X11 Merge
system display mode called combined mode. This mode takes advantage of the sophisticated
rendering capabilities of the TurboSRX 3D graphics accelerator, causing the two sets of display
planes (image and overlay) to be treated as one screen. The complex 3D images were rendered
in the image plane, and the listing, the clock, the buttons, and the plot were rendered in the overlay
plane.

What's Ahead 

The HP OSI Express card provides on one HP 9000 Series 800 I/O card the capabilities of the
network architecture defined by the ISO Open Systems Interconnection (OSI) Reference Model.
In the February issue, ten articles will provide insight into the OSI Express card implementation
of the model and will define what sets this implementation apart from other networking
implementations. Also featured will be the HP 71400A Lightwave Signal Analyzer, which measures the
characteristics of high-capacity lightwave systems and their components, including
single-frequency or distributed feedback semiconductor lasers and broadband pin photodetectors. An
accessory, the HP 11980A Fiber Optic Interferometer, helps characterize the spectral properties
of single-frequency lasers.

System Design for Compatibility of a
High-Performance Graphics Library and
the X Window System

The Starbase/X11 Merge system provides an architecture
that enables Starbase applications and X Window System
applications to coexist in the same window environment.

by Kenneth H. Bronstein, David J. Sweetser, and William R. Yoder



HP'S HIGH-PERFORMANCE 2D and 3D GRAPHICS
library called Starbase has proven very successful
in engineering workstation applications. Similarly,
the X Window System Version 11, or X11, has become
the de facto industry standard window system for supporting
user interfaces on workstations connected across a
network.1,2 Both of these systems run in the HP-UX environment
on the HP 9000 Series 300 and 800 Computer systems
(see boxes on pages 7 and 8).

Before the Starbase/X11 Merge project, the X Window
System and Starbase graphics applications were not able
to run on the same display. An application could use either
the Starbase high-performance graphics or it could run in
the X Window System, but not both simultaneously. Each
system makes simple assumptions about ownership
of the display and input devices, and this makes them
unable to coexist. Since HP is one of the industry leaders
in X Window System technology and Starbase is a
widely used graphics library, the Starbase/X11 Merge project
was started to design and implement a scheme whereby
X and Starbase applications could coexist on the same
display.

There were three major challenges associated with merging
Starbase and X11. The first challenge was to change
the architecture of the Starbase graphics libraries and the
X Window System so that a Starbase application could run
within an X window with full functionality and with
performance comparable to Starbase running on a dedicated
(nonwindowed) display. The second important challenge
was to enable existing Starbase applications to relink simply
with the new Starbase drivers and run in an X Window
System with no modifications to the application's source
code. The final major challenge was to coordinate the design
and development of this product over geographical
and organizational boundaries. The Starbase/X11 Merge
project was the joint effort of software engineers located at
HP's Graphics Technology Division (GTD) in Ft. Collins,
Colorado, and HP's Corvallis Information Systems Operation
(CIS) located in Corvallis, Oregon. The team in Colorado
was responsible for the Starbase portion of the project
and the team in Corvallis was responsible for the X Window
System portion of the project.

This article and the next five articles in this issue describe
the design and implementation techniques used to handle
these challenges.

Fig. 1. Incompatible architectures. (a) The architecture for
an X application. (b) The architecture for a Starbase application.
Both architectures assume complete ownership of the
display.

Design Alternatives

The architectures for a client (application) running in
the X environment and an application using Starbase are
shown in Fig. 1. The X Window System is network transparent,
which means that an application running on one
workstation can display itself to a user sitting at the same
workstation or at another system across a network.
Applications, or clients, running in the X Window System are
allowed access to the display only through the X server,
which is a separate process that arbitrates resource conflicts
and provides display, keyboard, and mouse services to all
applications accessing the display. Also, as shown in Fig.
1, many X applications can be served by the X server
simultaneously. Starbase, on the other hand, is a collection of
libraries and drivers for 2D and 3D graphics applications,
and only one Starbase application can run on the workstation
at a time.

In trying to merge Starbase and X,* we did not lack
alternative solutions. During the investigation stage there
was little doubt that we could change the architectures of
Starbase and X to coexist, but how to merge the two was
not clear. The design alternatives included:

■ Following the existing HP Windows/9000 model of adding
window management utilities to the Starbase libraries.

■ Implementing the X server on top of the Starbase
graphics libraries.

■ Implementing the X server using an internal low-level
Starbase interface.

■ Implementing an X driver for Starbase, using X Window
System Xlib calls.

■ Writing an X extension that implements Starbase
low-level semantics.

■ Developing Starbase and X drivers that cooperate in
accessing the display hardware.

The project team selected the last alternative. This approach
resulted in creating low-level drivers to support the
rendering requirements of both Starbase applications and
the X server, the restructuring of the server interface between
the low-level drivers and the device-independent
portion of the X server, and the development of a facility
to handle communication between X and Starbase.

Low-Level Driver Redesign

The Graphics Technology Division manufactures a variety
of display types with the following characteristics:

■ On-screen resolutions that range from 512 by 400 pixels
to 1280 by 1024 pixels.

■ Display planes that range from 1 (capable of displaying
black and white) to 24 (capable of displaying any of 16
million colors, with every available pixel a different
color).

■ Advanced hardware features, such as 2D and 3D graphics
accelerators. Graphics accelerators provide graphics
operations such as polygon clipping, rotation, and other
transformations implemented in high-speed hardware.
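
The relationship between plane count and displayable colors in the list above is simple arithmetic: each plane contributes one bit per pixel, so n planes give 2^n distinct pixel values. A minimal C sketch follows; the function name colors_for_planes is illustrative only and is not part of Starbase or X.

```c
#include <assert.h>

/* Hypothetical helper: number of distinct pixel values a display with
 * the given number of planes can show (one bit per plane per pixel). */
unsigned long colors_for_planes(unsigned planes)
{
    assert(planes < 32);      /* unsigned long is at least 32 bits wide */
    return 1UL << planes;
}
```

With this helper, one plane yields 2 values (black and white) and 24 planes yield 16,777,216 values, or roughly 16.7 million colors.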
To put the responsibility where the expertise lay and to

The X Window System is a trademark of the Massachusetts Institute of Technology.
*In this and other articles, X11 and the X Window System will also simply be referred to as X.



The Starbase Graphics Package

Starbase is a library [...] released in 1985, based on a draft of the ANSI and
[...] standard Computer Graphics Interface, or CGI. Since the first release,
features have been added to Starbase that go beyond the CGI standard. The
library includes [...] routines that read locations or button and key presses
from [...] that echo the position of an input device on an arbitrary display.

An important goal of the Starbase product is to provide a library
of functions that can be used on a range of devices. Starbase
conceals the details of device dependencies, allowing each program
to be used with a growing list of devices without making
changes to the program. The current Starbase products support
over 20 different devices. They include workstation displays, plotters,
terminals, mice, and data tablets. New devices can be used
as they become available by linking a program with new device
drivers. This device independence is also used to assist the
development of other graphics libraries. Implementations of libraries
for the ANSI standards Core Graphics System (CORE),
Graphics Kernel System (GKS), and Programmer's Hierarchical
Interactive Graphics System (PHIGS) use the Starbase device
drivers to support the same range of devices.

The device independence of Starbase coexists with access
to the full features and maximum performance of each device
that it works with. Common features, such as line and polygon
drawing, are supported directly on capable devices and emulated
on simpler devices. The more sophisticated features of
advanced displays, such as shaded images, are available to
programmers that require these features, but not emulated on
simpler devices.

Starbase has features tuned to the needs of particular groups
of customers. Some additions optimize strictly two-dimensional
graphics, such as for printed circuit layout, electrical design,
and drafting. Functions have been added to Starbase to support
integer coordinates and transformations that allow faster, more
cost-effective display systems for these applications. Other additions
emphasize three-dimensional images such as used for
advanced mechanical design. Starbase supports perspective
views of objects with shading simulating light sources, and draws
only those parts of an image that are not hidden behind solid
objects. The most recent additions to Starbase provide photorealism,
the appearance of near reality, through ray tracing and
radiosity technologies. See the article on page 78 for more
information about radiosity.



accommodate all these display types, the engineers at GTD
implemented the new display drivers, and the engineers
at CIS implemented the code to translate X server semantics
into display driver formats. The interface between the display
drivers and X was called the X driver interface, or
XDI. XDI is discussed later in this article.

During the design investigation phases, we discovered
that many requirements of the Starbase environment and
the X server environment were similar and the basic algorithms
that use the hardware were the same. This led to
the concept of shared drivers between the X server and
Starbase applications. Originally we hoped that the drivers
could be shared at the object code level, that is, the drivers

The X Window System

The X Window System, commonly referred to as X, is an industry
standard, network transparent window system. X presents to
the user a hierarchy of resizable overlapping windows providing
device independent graphics. A graphical user interface is commonly
included as an integral part of the X Window System. The
X Window System definition is maintained by the Massachusetts
Institute of Technology X Consortium.

The first implementations of X were developed jointly at MIT
by Project Athena and the Laboratory for Computer Science.
Project Athena was faced with the problem of writing software
for hundreds of displays from different vendors on machines all
connected by a local area network. They designed X, based on
the W window system, which was the work of Paul Asente, Brian
Reid, and Chris Kent of Stanford University and Digital Equipment
Corp.

The 1986 MIT release of X, Version 10.4, was the first version
with multivendor support. HP was among the first computer
manufacturers worldwide to sell X as a product when, in March 1987,
the company began shipping the X Window System for HP-UX.
In January 1988 the MIT X Consortium was formed, with HP
being one of the founding members. X Consortium members
include Apple Computer Inc., Ardent Computer, American Telephone
and Telegraph Inc., Calcomp Inc., Control Data Corporation,
Digital Equipment Corporation, Data General Corporation,
Fujitsu Microelectronics Inc., Hewlett-Packard, International
Business Machines Corporation, Eastman Kodak Corporation, NCR
Corporation, Nippon Electric Corporation, Prime Computer Inc.,
Silicon Graphics, Sun Microsystems Inc., Tektronix Inc., Texas
Instruments Inc., Unisys, Wang Laboratories Inc., Xerox Corporation,
and others.

The X Window System designers, Robert Scheifler of MIT and
Jim Gettys of Digital Equipment Corporation, adopted a set of
critical design objectives, specifying that the window system
must:

■ Work on a wide variety of hardware platforms and displays

■ Facilitate implementation of device independent applications

■ Be network transparent

■ Allow for application concurrency

■ Support differing application and management interfaces

■ Provide overlapping windows and output to obscured regions
of windows

■ Support a hierarchy of resizable windows

■ Provide support for text, 2D graphics, and imaging

■ Be extensible.



Their implementation of this design has gone through a number
of revisions. The implementation has stabilized at X Version 11,
which has been adopted as an industry standard. The current
standards bodies that have adopted some portion of X or are in
the process of adopting X include ANSI, IEEE, ISO (International
Standards Organization), NIST (National Institute of Standards
and Technology), OSF (Open Software Foundation), and X/OPEN.
MIT has facilitated the acceptance of X as a standard by
distributing the standards definition documents and the source
code of sample implementations for public use for a nominal fee.

The X Window System consists of the X server, the standard
X library, various library toolkits, and a set of X client applications.

■ The X server controls access to display hardware and input
devices.

■ The X library is the basic programmatic interface providing a
standard method to manipulate windows, control input, handle
window system events, provide text output, manipulate color
maps, render 2D device coordinate graphics, and extend the
client/server protocol.

■ The X toolkits provide standard sets of widgets, menus, and
other user interface objects. The toolkits facilitate the development
of applications that have a consistent, easy to use,
graphical user interface.

■ A window manager is provided as a special X application.
The functionality of the window manager has been separated
from the lower-level X server and X library. This modular design
has allowed different window managers and different user interface
models to be incorporated in any user's X environment.

The X server and the X library communicate via an asynchronous
stream-based interprocess communication protocol. This
protocol separates the application interface from the X server
implementation. The X server can then be ported to new display
devices without the need to modify the application programs.
Executable application code compatibility is maintained across
displays. This network protocol also provides the basis of network
transparency and interoperability. Network transparency means
that an application running on one computer can perform all
display and input operations for a user sitting either at the same
system or at another computer across the network. Network
transparency is provided at no cost to the application as part of the
standard X implementation. Interoperability means that network
transparency is preserved across various computer vendors'
products.



could be compiled once and linked into both the X server
and Starbase programs. The data structure environments
and some of the rendering semantics of the two environments
were too different to allow this, so the less restrictive
alternative of shared source code with conditional compilation
was chosen. This scheme enabled us to avoid changing
existing Starbase library code and duplicating low-level
display control and rendering operations for different display
types.
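
As a rough illustration of shared source with conditional compilation, the C sketch below writes one fill loop and lets a build-time switch select the surrounding data structure. Everything named here (BUILD_FOR_X_SERVER, XSurface, SbSurface, ROW_BASE, fill_row) is an assumption made for the example and is not taken from the actual Starbase or X driver sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch: define BUILD_FOR_X_SERVER when compiling the
 * X display driver; leave it undefined for the Starbase driver. */
#ifdef BUILD_FOR_X_SERVER
typedef struct { unsigned char *bits; int stride; } XSurface;  /* assumed X-side state */
#define SURFACE        XSurface
#define ROW_BASE(s, y) ((s)->bits + (size_t)(y) * (size_t)(s)->stride)
#else
typedef struct { unsigned char *fb; int width; } SbSurface;    /* assumed Starbase-side state */
#define SURFACE        SbSurface
#define ROW_BASE(s, y) ((s)->fb + (size_t)(y) * (size_t)(s)->width)
#endif

/* Shared low-level routine: the fill loop is written once; only the data
 * structure environment differs between the two builds. The caller must
 * keep the row segment inside the buffer. */
void fill_row(SURFACE *s, int y, int x, int count, unsigned char pixel)
{
    unsigned char *p;
    assert(s != NULL && count >= 0);
    p = ROW_BASE(s, y) + x;
    while (count-- > 0)
        *p++ = pixel;
}
```

Compiling the same file twice, once with and once without the switch defined, produces two object files with different data layouts but identical rendering logic, which is the property that object-level sharing could not provide.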

Restructuring the X Server

A sample implementation of the X server exists in the
public domain and is available from the Massachusetts
Institute of Technology (MIT). This sample implementation
has contributed greatly to the success of the X Window
System. The X server maintained by MIT provides X vendors
with a source code template from which X server
products can be developed. Starting with MIT's sample X
server, vendors can develop a version of the X server that
works on their hardware. The sample X server consists of
three major sections:

■ Device Independent X (DIX). High-level device independent
code for handling cursors, events, extensions, fonts,
and rendering requests.

■ Operating System Dependent Interface. This section contains
utilities used primarily by DIX to perform tasks
specific to the host operating system. For example, DIX
makes no assumptions about the structure of the host's
file system or about how to open communication channels;
these details are handled by the code in this section.

■ Device Dependent X (DDX). DDX contains the code that
performs device dependent I/O. For example, when a
client asks the X server to draw a circle or to display
text, DIX code interprets the request and passes it to the
appropriate procedure in DDX for proper device dependent
I/O. Conversely, when the user moves the mouse
or types on the keyboard, DDX conveys this information
to DIX for processing. DIX passes the information back
to interested clients.

Fig. 2. The modules in the X server. The device dependent
module (DDX) shows the modifications made to accommodate
the needs of the Starbase/X11 Merge system.



To handle our needs, the DDX was split into two
more layers: a translation module and the X display drivers
(see Fig. 2). The translation module, which was written by
the engineers in Corvallis, translates the data formats and
requests from DIX into a form suitable for the X display
drivers. The X display drivers, which were written by the
engineers at GTD in Colorado, do the rendering to a particular
display. Between these two layers is the X driver interface
(XDI). The X driver interface contains about four dozen
driver entry points, the corresponding data structures, and
a strict protocol for accessing the entry points.

This organization of DDX provided two benefits. First,
it enabled us to carry on development at two separate locations
and organizations, and second, it helped to eliminate
redundant Starbase and X display driver code development.
The functions provided by XDI include:

■ Driver and device control

■ Color map manipulations

■ Accelerated graphics window support

■ Cursor, raster, filling, vector, and text operations.

The translation module, which translates rendering requests
from DIX into a format appropriate for the low-level
X display drivers, can be very simple, as in the following
DDX-to-XDI routine, which handles the DIX request to fill
horizontal rows of pixels.

    void
    FillSpans(pDrawable, pGC, nInit, pptInit, pwidthInit, fSorted)
        DrawablePtr  pDrawable;    /* pointer to drawing surface */
        GCPtr        pGC;          /* pointer to the graphics context */
        int          nInit;        /* number of spans to fill */
        DDXPointPtr  pptInit;      /* pointer to list of start points */
        int          *pwidthInit;  /* pointer to list of n widths */
        int          fSorted;      /* ignored */
    {
        DECLARE_XDI_POINTERS       /* set up data pointers */
        GET_XDI_INFO               /* get information */
        PREPARE_TO_RENDER          /* set up display hardware */

        /* FillScanline is an XDI routine that accomplishes the
           fill request */
        (*(pxdiGCJumpTable->FillScanline))(pxdiRender,
            pxdiDrawable, pGC,
            nInit,                 /* number of spans to fill */
            (int16 *)(pptInit),    /* pointer to list of start points */
            (int32 *)(pwidthInit)); /* pointer to list of n widths */

        FOLLOWUP_RENDERING         /* restore state */
    }
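
The routine above forwards the request through a table of function pointers. The self-contained C sketch below shows that dispatch pattern in miniature. It is only an illustration under stated assumptions: DriverJumpTable, FrameBuffer, Point, soft_fill_scanline, soft_driver, and fill_spans are hypothetical names, and the real XDI data structures and setup macros are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define FB_WIDTH  16
#define FB_HEIGHT 4

typedef struct { short x, y; } Point;                    /* start of one span */
typedef struct { unsigned char px[FB_HEIGHT][FB_WIDTH]; } FrameBuffer;

/* Hypothetical driver jump table: one function pointer per entry point. */
typedef struct {
    void (*FillScanline)(FrameBuffer *fb, unsigned char color,
                         int nspans, const Point *starts, const int *widths);
} DriverJumpTable;

/* One possible driver: fills spans in a software frame buffer.
 * No clipping is done; the caller must keep spans inside the buffer. */
static void soft_fill_scanline(FrameBuffer *fb, unsigned char color,
                               int nspans, const Point *starts,
                               const int *widths)
{
    int i, x;
    for (i = 0; i < nspans; i++)
        for (x = 0; x < widths[i]; x++)
            fb->px[starts[i].y][starts[i].x + x] = color;
}

const DriverJumpTable soft_driver = { soft_fill_scanline };

/* Translation-layer routine: forwards the request through the table, so
 * the caller never names a specific display driver. */
void fill_spans(const DriverJumpTable *drv, FrameBuffer *fb,
                unsigned char color, int nspans,
                const Point *starts, const int *widths)
{
    assert(drv != NULL && drv->FillScanline != NULL);
    (*drv->FillScanline)(fb, color, nspans, starts, widths);
}
```

Supporting another display type in this sketch means supplying another DriverJumpTable; fill_spans itself does not change.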

To allow processes to acquire specialized information
from the X server and to make specialized requests to the
X server, a small number of extensions were added to the
X server so that Starbase applications could:

■ Register Starbase windows with the server 

■ Retrieve the current list of rectangles that define win- 
dows visible on the screen 

■ Set up an error handler 

■ Note changes to the hardware color map. 
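
To suggest why an application rendering directly would want the list of visible rectangles, the C sketch below clips one horizontal span against such a list. The Rect and Span types and the clip_span function are hypothetical, and the sketch assumes the rectangles in the list do not overlap one another; it does not show the actual extension calls.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, w, h; } Rect;   /* hypothetical visible rectangle */
typedef struct { int x, w; } Span;         /* visible piece of one row */

/* Clip the span starting at (x, y) with the given width against each
 * visible rectangle. Writes the visible pieces to out[] (capacity
 * max_out) and returns how many were written. */
int clip_span(const Rect *clip, int nclip, int x, int y, int width,
              Span *out, int max_out)
{
    int i, n = 0;
    assert(clip != NULL && out != NULL);
    for (i = 0; i < nclip && n < max_out; i++) {
        int left, right;
        if (y < clip[i].y || y >= clip[i].y + clip[i].h)
            continue;                          /* row misses this rectangle */
        left  = x > clip[i].x ? x : clip[i].x;
        right = (x + width < clip[i].x + clip[i].w) ? x + width
                                                    : clip[i].x + clip[i].w;
        if (left < right) {                    /* non-empty overlap */
            out[n].x = left;
            out[n].w = right - left;
            n++;
        }
    }
    return n;
}
```

A renderer would draw only the returned pieces, leaving pixels that belong to other windows untouched.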

Resource Sharing

To facilitate the exchange of information between Starbase
and X, and to allow multiple processes to share offscreen
memory and other display resources efficiently, the
graphics resource manager (GRM) was developed. The
GRM does not access the hardware directly because it is
designed to function as a notepad on which Starbase and
X can both write information regarding their use of display
resources. The GRM also keeps track of shared resources
so that both X and Starbase applications can coexist on the
same display. See the article on page 12 for more information
on the graphics resource manager.
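
The notepad idea can be sketched as a small bookkeeping table that records which client holds which region of offscreen memory without ever touching the hardware. The C code below is a simplified illustration only: GrmTable, GrmEntry, grm_init, grm_alloc, and grm_release are hypothetical names, the table lives in a single process for brevity, and the placement policy (lowest free offset) is an assumption rather than the GRM's documented behavior.

```c
#include <assert.h>
#include <stddef.h>

#define GRM_MAX_ENTRIES 8

typedef struct { int owner; int offset; int size; } GrmEntry; /* owner 0 = free slot */
typedef struct {
    int total;                        /* size of the shared offscreen area */
    GrmEntry entry[GRM_MAX_ENTRIES];
} GrmTable;

void grm_init(GrmTable *t, int total)
{
    int i;
    assert(t != NULL && total > 0);
    t->total = total;
    for (i = 0; i < GRM_MAX_ENTRIES; i++)
        t->entry[i].owner = 0;
}

/* Record a region of the given size for owner at the lowest free offset.
 * Returns the offset, or -1 if the table or the memory is exhausted. */
int grm_alloc(GrmTable *t, int owner, int size)
{
    int slot = -1, offset = 0, i, moved;
    assert(t != NULL && owner != 0 && size > 0);
    for (i = 0; i < GRM_MAX_ENTRIES; i++)
        if (t->entry[i].owner == 0) { slot = i; break; }
    if (slot < 0)
        return -1;                                   /* table full */
    do {                                             /* slide past overlapping entries */
        moved = 0;
        for (i = 0; i < GRM_MAX_ENTRIES; i++) {
            const GrmEntry *e = &t->entry[i];
            if (e->owner != 0 && offset < e->offset + e->size
                              && e->offset < offset + size) {
                offset = e->offset + e->size;
                moved = 1;
            }
        }
    } while (moved);
    if (offset + size > t->total)
        return -1;                                   /* no room left */
    t->entry[slot].owner = owner;
    t->entry[slot].offset = offset;
    t->entry[slot].size = size;
    return offset;
}

/* Forget every region recorded for owner (for example, when it exits). */
void grm_release(GrmTable *t, int owner)
{
    int i;
    assert(t != NULL && owner != 0);
    for (i = 0; i < GRM_MAX_ENTRIES; i++)
        if (t->entry[i].owner == owner)
            t->entry[i].owner = 0;
}
```

Because the table only records offsets and sizes, each client remains responsible for doing its own reads and writes to the region it was granted.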

Starbase/X11 Architecture

Fig. 3 depicts the basic software architecture for the
Starbase/X11 Merge project. The figure implies that X and Starbase
are both accessing the display at the same time. The
design allows for any number of Starbase applications and
any number of X clients to coexist on the same display.
The role of the GRM in this figure is to allocate resources
among cooperating X server and Starbase processes.

Fig. 4 shows the architecture of a "window-smart"
graphics application that makes programmatic use of both
Starbase and X from within a single program. This facility
allows Starbase programmers to use X rendering facilities to
enhance the usability and appearance of their applications.

Conclusion

The Starbase/X11 Merge project occurred in an era of
increasing complexity in computer software. Software projects
are getting larger and more geographically distributed.
This complexity is also being faced during a time when a
new tactical model has emerged in the computer industry.
Diverse groups (sometimes involving a company's competitors)
are forming alliances to achieve a greater goal than
any entity could achieve alone. The Massachusetts Institute
of Technology X Consortium is a successful example of
this new model at work.

Starbase/X11 Merge Glossary

Because some of the terminology used here and [...] of the articles may be
new or specific to Starbase/X11, or they may be used before they are [...],
the following terms are defined.

Backing Store. Locations in [...] memory where the contents of a window are
backed up if a window becomes obscured because of some window system or
user action.

Bit Map. A pixmap having a depth of one. On monochrome
displays the X server maintains all pixmaps as bit maps.

Clip List. A list of rectangles representing the obscured and/or
unobscured areas of a window.

Clipstamp. An integer associated with a window that is used
to determine the current validity of a list of clipping rectangles
associated with that window.

Color Map. A set of hardware registers that maintain the
red-green-blue components of individual pixels. Pixel values, which
are commonly in the range of 0 to 255, serve as indexes into the
color map.

Combined Mode. An X server operating mode on the TurboSRX
display in which the overlay and image planes appear as a
single, integrated set to the user.

Cursor. An indicator on the screen used to direct the user's
attention. The X cursor (or input pointer) traverses the whole
display, whereas Starbase cursors (commonly referred to as
echoes) move within individual Starbase windows.

DDX. Device Dependent X. The portion of the X server devoted
to handling device dependent I/O.

DHA. Direct Hardware Access. A method that allows a Starbase
application to bypass the X server and render directly to the
frame buffer.

Display Enable Register. A hardware register that controls
which planes of the display are viewable. Starbase and X use
the display enable register to implement double buffering.

DIX. Device Independent X. A section of the X server that contains
a scheduler, a resource allocator, a high-level color map, and
code for handling window functions, such as cursors, events,
extensions, fonts, graphics context, and rendering.

Drawable. A logical raster (on the screen or in memory) upon
which X and Starbase can draw. Windows and pixmaps are both
types of drawables.

Double Buffering. A graphics technique to enhance the smoothness
of motion. The technique works by using the display enable
register to toggle between two buffers. While one buffer is being
rendered into, the other is displayed. When rendering to the
hidden buffer is complete, the display enable register is changed
and the hidden buffer is displayed and the previously displayed
buffer becomes the new hidden buffer.

Frame Buffer. The video memory of a display device in which
each element represents one picture element, or pixel. The frame
buffer is divided into two parts, on-screen memory (current image
on the screen) and offscreen memory (graphics memory that is
never visible).

Graphics Context. A self-consistent set of attributes such as
foreground and background colors, line styles, and fill patterns
which are used by X clients to specify how the X server should
render the drawing requests it receives.

Gopen (Graphics Open). The Starbase action of opening a display
device or window to create a virtual device that Starbase
can render to.

GRM. Graphics Resource Manager. The GRM is a process that
handles requests from the X server and Starbase applications
for [...] memory and [...].

Image Planes. The primary display memory [...] display
systems used for rendering complex images.

MOMA Windows. Multiple, obscurable, movable, and accelerated
windows. Hardware logic in the graphics accelerator provides
very fast drawing and clipping of multiple windows.

Naming Conventions. The following conventions apply to procedures
mentioned in these articles:

■ X<name> is a standard X library procedure (e.g.,
XGetWindowInfo)

■ XHP<name> is an HP X-extension library procedure (e.g.,
XHPGetServerMode)

■ xQs<name> is a procedure inside the X server, located in the
translation layer between DIX and the X display drivers

■ <name> without any prefix is typically an application-level
procedure, but must be interpreted in context.
Offscreen Memory. A portion of the frame buffer that cannot be
displayed on the monitor. In all other respects, offscreen memory
behaves the same as on-screen (visible) memory. Starbase and
X use offscreen memory to hold character, cursor, pixmap, and
scratch information for rapid transfers to on-screen memory.

Optimized Font. A character set that has been placed into
offscreen memory to increase its display output performance.

Overlay Planes. Planes of display memory that are visually on
top of or in front of the image planes. These planes are disabled
or set to a transparent color to view the image planes.

Pixel. The smallest addressable picture element of a display.
Typical HP displays have between one and two megapixels.

Pixel Value. A numeric value, typically between 0 and 255, which
determines the color of an individual pixel.

Pixmap. A hidden rectangle of raster data which is maintained
in offscreen memory when there is room, and in virtual memory
when there is no room in offscreen memory.

Raster Data. A data structure described by a two-dimensional
array of pixel values.

Raw Mode. Running a Starbase application without any window
system.

Rendering. Any form of drawing operation, including text, line,
and raster output. Rendering may occur to on-screen memory,
offscreen memory, or virtual memory.

Sample Server. The X11 server template source code made
available to the general public by the X Consortium that enables
X vendors to develop servers for their own products.

Scanline. A horizontal row of pixels.

Shared Memory. A contiguous area of process data space that
is shared with another process. The X server and Starbase
applications use shared memory for communication and for sharing
fonts, color maps, and other display resources.

Socket. A communications channel between two HP-UX processes.
There are two types of sockets: internet sockets, which are
communication channels between machines across a network, and
HP-UX domain sockets, which provide faster communication
within the same machine.

Stacked Screens Mode. X server operation on overlay and
image planes in which the two sets of planes are treated as
separate display devices.

Stacking Order. An ordering imposed on a set of windows that
represents the apparent visual ordering of the windows to the
user. For a window to be at the top of the stacking order means
that it cannot be occluded by any other window.

Tile. A pixmap replicated many times to form part of a larger
pattern.

Transparent Color. A pixel value in the overlay planes that
causes the information in the image planes to be displayed
instead of the information in the overlay planes.

TurboSRX. A 3D graphics subsystem that includes a triple
transform engine, a scan converter, a 16-bit z-buffer, four
overlay planes, and up to 24 image planes. The TurboSRX also
includes the microcode to provide interactive 3D solids
rendering, photorealism, and window clipping capabilities.

Virtual Memory. Memory that the HP-UX operating system
allocates to an executing process. It is called virtual because
although the memory appears to be in physical memory to the
process, the system may swap it to and from a disk. The X display
drivers are capable of rendering graphics images to virtual
memory as well as to on-screen memory.



Visual Type. The color map capabilities of a given display.
Common visual types supported on HP displays include 1-bit static
gray (or monochrome), 8-bit pseudo color (having 256 color map
cells of RGB values), and 24-bit direct color (using 8 bits each
for red, green, and blue values).

Window. An on-screen rectangle of raster data that can be
mapped (displayed), unmapped (removed), and rendered to.

XDI. X driver interface. A set of entry points that exist in the
device dependent section of the X server, which provide an
interface between the server's translation module and the X
display drivers.

X Client. A program that interacts with the X server through one
of the X libraries using the X client/server protocol.

X Protocol. The specification from the MIT X Consortium that
precisely defines the behavior of the X server in its treatment of
clients, its handling of events and error conditions, and its
rendering operations.



Managing and Sharing Display Objects 
in the Starbase X11 Merge System 

To allow Starbase and X to share graphics resources, a 
special process called the graphics resource manager was 
created to manage access to the shared resources. An 
object-oriented approach was taken to encapsulate these 
shared graphics resources. 

by James R. Andreas, Robert C. Cline, and Courtney Loomis



ONE OF THE CHALLENGES for the Starbase/X11
Merge project was designing an architecture that
supports sharing of resources among X and Starbase
applications. These HP-UX processes can realize significant
memory savings by sharing resources such as
character sets or fonts. X and Starbase also compete for
private use of display resources. The architecture we
developed, called the graphics resource manager, or GRM,
supports the allocation of shared resources and at the same
time provides use of display resources by individual
processes.

The GRM consists of an HP-UX process and a library.
The GRM library is linked with the X server and Starbase
applications, and calls are made to the GRM library to
communicate with the GRM process. Fig. 1 shows the GRM
architecture discussed in this article. The GRM handles a
request it receives from the library and returns a response
to the library. The library unpacks the response and returns
the information to the caller. The GRM supports three
modes of operation:

■ The X server operating alone

■ A Starbase application operating alone without the
support of any window system

■ The X server with a Starbase application running in a
window.

What Is Managed? 

When we began investigating the GRM architecture, we 
assumed that we would be allocating two basic resources, 
shared memory and offscreen memory. Shared memory is 
a memory resource supported by HP-UX 1,2 which can be 
attached to the address space of multiple processes. Each 
process can access the shared memory space directly. By 
using shared memory in the GRM architecture, one process
can load character font information into shared memory,
and another process can later use the font.

Fig. 1. The architecture of the graphics resource manager.

Offscreen memory is a region of the display frame buffer
that is not visible on the display screen. The frame buffer
is the video memory of a display device dedicated to
maintaining the value of the pixels. The X server and Starbase
drivers use offscreen memory to optimize a variety of
rendering operations. Many of HP's graphics hardware products
provide offscreen memory in various shapes and sizes.
Fig. 2 shows an example of the frame buffer memory available
in the HP 98550A Color Graphics Board. The block
mover hardware can be used to copy areas of the offscreen
memory into visible memory. Font glyphs, which define
the pixels to be turned on for a particular character font or
set, are generally loaded into offscreen memory so that the
block mover can be used to render the glyphs to a window
at very high speeds. Pixmap patterns are also loaded into
offscreen memory so that the block mover can be used to
fill areas of the screen using the pixmap pattern (this is
how a window background is painted). A pixmap is an
array of pixel values (numerical values typically between
0 and 255) that determine the color of individual pixels.

Offscreen memory is limited to the size provided by the
display hardware. Additional memory cannot be allocated
by the system, and so the allocation of offscreen memory
must be done carefully. Other processes can obtain
offscreen memory for the storage of unique pixmaps. The
pixmaps can subsequently be used for rendering operations,
such as a tile to fill a polygon, a background pattern
for a window, or an image used frequently in a program
(e.g., a pushbutton outline).

Object-Oriented Approach 

When deciding how to implement the allocation of shared
and private resources for clients, we decided
to use an object-oriented approach and encapsulate
resources in objects.3 The first thing we did was to identify
the items we wanted to treat as objects. We identified three
types of graphics resource manager objects and their
attributes.

■ Shared memory objects, which are used to share fonts
or information about some aspect of the display or system
state.

■ Offscreen memory objects, which are used to reserve an
area of the offscreen memory resource.

■ Semaphore objects, which are used to share a system
semaphore. The semaphore helps synchronize various
processes.4

The attributes of GRM objects are divided into two groups,
general attributes and specific attributes. The general
attributes of a GRM object are a set of fields that define the
object's name. These fields are consistent among all GRM
objects. The following shows the name fields for a GRM
object.

int   class;                       /* class of object, client defined */
dev_t screen;                      /* screen device */
int   window;                      /* X window id */
char  name[GRM_MAX_NAME_LENGTH];   /* string identifier of object */
dev_t device;                      /* disk device for fonts */
int   mode;                        /* mode of a font */
int   key;                         /* key of a font */
int   partition;                   /* partition of offscreen memory */

The name of a GRM object is a conjunction of all the 
fields. Two objects may differ by as little as one value in 
any one of the fields. 

Object-specific information is added to the instance of
an object. For example, a shared memory object includes
the specific size of the object and its specific location in
the GRM shared memory segment, and an offscreen memory
object is described by its specific width, height, and
depth, as well as its specific location in three dimensions
in offscreen memory.

Fig. 2. Frame buffer memory in the HP 98550A Color
Graphics Card.

Fig. 3. Architecture for building the GRM into the X server.

Operations on Objects 

The GRM supports a set of operations that can be
performed on the objects in a consistent way.

■ GrmCreateObject. The GrmCreateObject function allocates an
object instance of the requested class, allocates the
requested resource, and adds the client to
the list of clients that are using the object. If the object
already exists or cannot be created, the GRM returns an
error. The client may then share the object, if it desires,
by calling the GrmOpenObject function.

■ GrmOpenObject. If the described object already exists
(from calling GrmCreateObject), the client is added to the
list of clients that are sharing the object. The GRM then
passes the object's attributes back to the client. If the
object doesn't exist, the GRM returns an error.

■ GrmCloseObject. The GrmCloseObject function causes the
GRM to delete the client from the list of clients that are
sharing the object. When all clients have lost interest in
an object, the object is destroyed, and the object's
resources are freed.
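
As an illustration only, the create/open/close life cycle described above can be sketched in C as a small reference-counted table. This is not the GRM source: the lowercase function names, the table size, the error codes, and the reduction of the object name to a class plus a string are all assumptions made for the example, and the sketch counts clients instead of keeping a list of them.

```c
#include <assert.h>
#include <string.h>

#define GRM_MAX_OBJECTS 16   /* hypothetical table size */
#define GRM_NAME_LEN    32   /* hypothetical name length */

enum { GRM_OK = 0, GRM_ERR_EXISTS = -1, GRM_ERR_NOT_FOUND = -2, GRM_ERR_FULL = -3 };

typedef struct {
    int  in_use;               /* slot holds a live object */
    int  obj_class;            /* client-defined class */
    char name[GRM_NAME_LEN];   /* string identifier */
    int  clients;              /* number of clients sharing the object */
} GrmObject;

static GrmObject grm_table[GRM_MAX_OBJECTS];

/* Look up a live object by its (simplified) name fields. */
static GrmObject *grm_find(int obj_class, const char *name)
{
    for (int i = 0; i < GRM_MAX_OBJECTS; i++)
        if (grm_table[i].in_use && grm_table[i].obj_class == obj_class &&
            strcmp(grm_table[i].name, name) == 0)
            return &grm_table[i];
    return NULL;
}

/* Create: an error if the object already exists; the creator
 * becomes the first client of the new object. */
int grm_create(int obj_class, const char *name)
{
    assert(strlen(name) < GRM_NAME_LEN);
    if (grm_find(obj_class, name))
        return GRM_ERR_EXISTS;
    for (int i = 0; i < GRM_MAX_OBJECTS; i++) {
        if (!grm_table[i].in_use) {
            grm_table[i].in_use = 1;
            grm_table[i].obj_class = obj_class;
            strcpy(grm_table[i].name, name);
            grm_table[i].clients = 1;
            return GRM_OK;
        }
    }
    return GRM_ERR_FULL;
}

/* Open: an error if the object does not exist; otherwise one more sharer. */
int grm_open(int obj_class, const char *name)
{
    GrmObject *o = grm_find(obj_class, name);
    if (!o)
        return GRM_ERR_NOT_FOUND;
    o->clients++;
    return GRM_OK;
}

/* Close: drop one sharer; the object is destroyed when none remain. */
int grm_close(int obj_class, const char *name)
{
    GrmObject *o = grm_find(obj_class, name);
    if (!o)
        return GRM_ERR_NOT_FOUND;
    if (--o->clients == 0)
        o->in_use = 0;
    return GRM_OK;
}
```

A client that gets GRM_ERR_EXISTS from grm_create would typically fall back to grm_open, which mirrors the sharing path described for GrmOpenObject.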

Each function is an atomic operation because no other
operation is allowed to be performed while one is in
progress. As the project progressed it became necessary to
group several of these operations into one large atomic
operation. Functions were added to mark the beginning
and end of these larger transactions.

The GRM also supports a function to find and list the
objects it has created. To query the existence of sets of
objects, the client can supply an object name with the fields
set to match the value fields in other objects. This function
is primarily used for debugging purposes.

Design and Implementation 

The project teams investigated three main architectures
to determine the best design:

■ Build the GRM into the X server. One of the first
architectures we examined was building the GRM functionality
into the X server. In this architecture, the Starbase
programs would communicate with the X server to allocate
resources. Fig. 3 shows this architecture. We did not
choose this architecture for several reasons. One reason
is that the X server is used primarily as a rendering
engine. The X server could be busy for many seconds
performing a rendering request, causing the Starbase
client to block until the X server could process a request.
Also, GRM functionality would become dependent on a
particular software technology in the X server, which
may change as enhancements are made to X. Another
problem occurs when a Starbase application is running
alone in raw mode. The X server would have to be
executed to support the Starbase client even though the X
Window System operation was not desired.

■ Construct the GRM as a library. The second architecture
the team examined implemented the GRM as a library,
which could be linked into the X server and Starbase
clients (see Fig. 4). Resource allocation would be
performed by the library with multiprocess communication
done through a single shared memory segment. With
this scheme, allocation of objects could be done very
quickly. The allocation operation would consist of
directly manipulating data structures in the shared memory
segment. This model was not chosen because of concerns
about its ability to support future upgrades, and
because it relies on consistent operation among all
implementations that manipulate the shared memory
information. We felt that we could achieve more robustness
by choosing a protocol-based communications model.
To support future version changes in this model, the
data structures would have to be designed with built-in
flexibility and version information. Proving that a newer
version of the GRM library would work properly with
older versions and vice versa would have been very
difficult.

■ Form the GRM as an independent process. After
considering the previous two models, the project team settled
on implementing the GRM as an independent process.
The independent process model is shown in Fig. 5. The
independent process model provides logical isolation
between the GRM and its client processes (the X server
and Starbase processes). The GRM process is free to define
its data structures for allocating objects without worrying
about access to these structures by the client processes.
This architecture also enabled the designers of the GRM
to be flexible in the algorithms used to allocate the objects
without worrying about backwards compatibility with
previous versions of the GRM. The protocol between the
GRM and its clients is also typed with a version number,
and the protocol data structures are padded to maximize
the potential upgradability of the GRM services.

Interaction with X and Starbase 

Fig. 4. Architecture for constructing the GRM as a library.

Fig. 5. Architecture for constructing the GRM as an
independent process.

Communication with the GRM is originated by either
the X server or the Starbase display driver. The GRM
process works by receiving a request, processing the request,
and then returning a reply message to the requester. An
application can perform both X library calls and Starbase
library calls. This results in activity by both the X server
and the Starbase driver. To get their work done, these GRM
clients can call functions in the GRM library to create or
open objects. The operation is synchronous because the
caller is blocked until the operation is completed by the
GRM.5 The GRM library packages the client request and
sends the request to the GRM process. The GRM process
processes the request, and if it is asked to create an object,
allocates the resources for the object. The client is
added to the list of clients using the object. Finally,
the GRM process returns a reply, which is received by the
GRM library. The GRM library unpacks the reply and
returns information describing the object to the caller.
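
The request and reply exchange can be pictured with a short C sketch. The article does not give the actual message layout, so everything here is assumed for illustration: the structure names, the opcodes, the status codes, and the amount of padding. What the sketch does take from the text is the idea of a version-stamped, padded message that the library packs and the GRM process checks before acting on it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GRM_PROTOCOL_VERSION 1u   /* hypothetical version number */

enum { GRM_REQ_CREATE = 1, GRM_REQ_OPEN = 2, GRM_REQ_CLOSE = 3 };
enum { GRM_REPLY_OK = 0, GRM_REPLY_BAD_VERSION = 1, GRM_REPLY_BAD_REQUEST = 2 };

typedef struct {
    uint32_t version;     /* protocol version spoken by the sender */
    uint32_t opcode;      /* create, open, or close */
    uint32_t obj_class;   /* client-defined object class */
    char     name[32];    /* string identifier of the object */
    uint32_t pad[4];      /* reserved for future fields, sent as zero */
} GrmRequest;

typedef struct {
    uint32_t version;
    uint32_t status;
    uint32_t pad[4];      /* reserved for future fields */
} GrmReply;

/* Library side: package a client call into a flat request message.
 * buf must have room for sizeof(GrmRequest) bytes. */
size_t grm_pack_request(unsigned char *buf, uint32_t opcode,
                        uint32_t obj_class, const char *name)
{
    GrmRequest req;
    memset(&req, 0, sizeof req);              /* zero padding and name */
    req.version = GRM_PROTOCOL_VERSION;
    req.opcode = opcode;
    req.obj_class = obj_class;
    strncpy(req.name, name, sizeof req.name - 1);
    memcpy(buf, &req, sizeof req);
    return sizeof req;
}

/* Daemon side: validate the message and build the reply.
 * Real allocation work is omitted from this sketch. */
GrmReply grm_handle_request(const unsigned char *buf, size_t len)
{
    GrmRequest req;
    GrmReply reply;
    memset(&reply, 0, sizeof reply);
    reply.version = GRM_PROTOCOL_VERSION;
    if (len != sizeof req) {
        reply.status = GRM_REPLY_BAD_REQUEST;
        return reply;
    }
    memcpy(&req, buf, sizeof req);
    if (req.version != GRM_PROTOCOL_VERSION)
        reply.status = GRM_REPLY_BAD_VERSION;
    else if (req.opcode < GRM_REQ_CREATE || req.opcode > GRM_REQ_CLOSE)
        reply.status = GRM_REPLY_BAD_REQUEST;
    else
        reply.status = GRM_REPLY_OK;
    return reply;
}
```

In a real system the packed bytes would travel over the socket between the library and the GRM process, with the caller blocked until the reply arrives.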

The GRM process never directly modifies data in the
GRM shared memory segment or in the display hardware.
The GRM process instead acts upon an "abstract view" of
these resources. The GRM maintains a data structure
representing the available resources in the GRM shared
memory and the display hardware offscreen memory (see these
data structures in Fig. 1). When the GRM process allocates
an object it updates the associated data structure.

Allocation of Offscreen Memory 

Currently, all HP display devices supported by the HP
9000 Series 300 and Series 800 product lines provide an
offscreen memory resource. This memory is configured on
the device as an extension to the memory used to hold
viewable information on the display. Since display memory
has a width, a height, and a depth, the offscreen memory
also has these dimensions. This complicates the management
of this memory because the GRM memory manager must
allocate three-dimensional objects. Offscreen memory is
relatively easy to manage if only one process wishes to
display data on the screen at a time. However, in the
Starbase/X11 Merge architecture, multiple applications share
the display device, so managing the sharing of the offscreen
memory resource is quite a challenge. On some HP display
devices, pixmaps of varying depths can be allocated. Also,
some display devices require that the pixmaps be aligned
on pixel boundaries for efficient access. The challenge is
to be able to allocate an arbitrarily sized and aligned
three-dimensional box out of an arbitrarily sized
three-dimensional box of free space. In addition, the algorithm
must efficiently deal with the resulting free space for future
allocations.

Three-dimensional objects are typically perceived as
spheres and polyhedra of various shapes and sizes. Pixmaps
are represented as three-dimensional objects as
six-sided blocks. A pixmap generally has a uniform width, a
uniform height, and a uniform depth. The GRM algorithm
addresses such pixmaps.

The Two-Dimensional Case 

Three-dimensional allocation is best explained as an ex- 
tension of the two-dimensional case. The following discus- 
sion of the two-dimensional case will show that the addi- 
tion of a third dimension is a fairly simple extension of 
the two-dimensional philosophy. 

We start with the two boxes shown in Fig. 6. Box A
represents the available memory resource and box B is the
space to be allocated out of box A. If box B is placed inside
box A, the rest of A can be divided into any of the
configurations shown in Fig. 7.

Configuration 1 produces a lot of fragmentation of the
free space. This fragmentation alone is enough to discount
it as a viable option. This leaves configurations 2 and 3.
There is only one difference between these two configurations
and that concerns how memory is globally allocated.
With configuration 2, free space is cut into vertical strips,
which results in memory being allocated in vertical strips,
and in configuration 3, free space is cut into horizontal
strips, which results in memory being allocated in horizontal
strips. In general, it makes little difference which
configuration is chosen. For software used on Hewlett-Packard
workstations, there is a reason to use horizontal strips.
Fonts are stored as horizontal strings of characters. Since
caching fonts is a major use of offscreen memory,
configuration 3 was chosen as the optimal solution.
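
The horizontal-strip idea can be made concrete with a small C sketch. Since Fig. 7 is not reproduced here, the exact cut lines are an assumption: the sketch gives the strips above and below box B the full width of box A, and leaves the pieces to the left and right of B only as tall as B itself. The names Rect and split_horizontal are invented for the example.

```c
#include <assert.h>

typedef struct { int x, y, w, h; } Rect;

static long rect_area(Rect r) { return (long)r.w * r.h; }

/* Returns 1 if b is non-empty and lies entirely inside a. */
static int rect_contains(Rect a, Rect b)
{
    return b.w > 0 && b.h > 0 &&
           b.x >= a.x && b.y >= a.y &&
           b.x + b.w <= a.x + a.w &&
           b.y + b.h <= a.y + a.h;
}

/* Carve b out of a, leaving the free space as horizontal strips.
 * Candidate pieces, in order:
 *   strip above b   (full width of a)
 *   strip below b   (full width of a)
 *   piece left of b (same height as b)
 *   piece right of b (same height as b)
 * Empty pieces are skipped. Returns the number of rectangles
 * written to out. */
int split_horizontal(Rect a, Rect b, Rect out[4])
{
    assert(rect_contains(a, b));
    Rect pieces[4] = {
        { a.x,       a.y,       a.w,                       b.y - a.y },
        { a.x,       b.y + b.h, a.w,                       (a.y + a.h) - (b.y + b.h) },
        { a.x,       b.y,       b.x - a.x,                 b.h },
        { b.x + b.w, b.y,       (a.x + a.w) - (b.x + b.w), b.h },
    };
    int n = 0;
    for (int i = 0; i < 4; i++)
        if (rect_area(pieces[i]) > 0)
            out[n++] = pieces[i];
    return n;
}
```

Placing B in a corner of A leaves only two free pieces, a short strip beside B and one wide strip below it, which suits long horizontal runs of font glyphs.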

Adding a Third Dimension 

Adding a third dimension to this problem means taking
the two-dimensional view and adding the concept of a
front and a back to the object being allocated (see Fig. 8a).
As with the two-dimensional model, there are a few ways
to handle breaking off front and back pieces to make
efficient use of the resulting space. Each method results in six
free blocks and one allocated block out of the original block
of memory. To coalesce the blocks when an allocated block
is freed, the GRM associates the free blocks resulting from
an allocation with the allocated block (see Fig. 8b). With
this scheme, when the originally allocated block is freed,
the blocks that can coalesce with it are easily found.

Fig. 6. Two-dimensional allocation of offscreen memory. Box
A is the available memory and box B is the space to be
allocated from box A.

Fig. 7. Different memory allocation configurations possible
when space for a box is allocated from a larger box.

The allocated block forms a node of a tree, with the leaves
of the tree initially being free blocks. New requests for
offscreen memory cause one of the surrounding blocks to
be allocated, with the result being that the new allocated
block becomes a node with a new set of leaf blocks showing
the free areas; that is, the tree grows (see Fig. 9). As blocks
are freed, the tree shrinks as leaves are coalesced with parent
nodes. For efficient access, the GRM maintains a list of free
blocks. This list optimizes the search for the best-sized free
block to satisfy an allocation request.
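
A minimal C sketch of the three-dimensional split follows. It is an illustration under assumptions, not the GRM allocator: the front and back pieces are taken as full-face slabs of the original block, the remaining four pieces follow the horizontal-strip rule within the depth range of the allocated block, and the names Box, AllocNode, split_box, and coalesce are invented here. The free-block list and the search for a best-sized block are left out.

```c
#include <assert.h>

typedef struct { int x, y, z, w, h, d; } Box;

typedef struct {
    Box allocated;      /* the allocated block: a node of the tree */
    Box free_leaf[6];   /* free blocks produced by this allocation */
    int nleaves;        /* how many of them are non-empty */
} AllocNode;

static long box_volume(Box b) { return (long)b.w * b.h * b.d; }

static int box_contains(Box a, Box b)
{
    return b.w > 0 && b.h > 0 && b.d > 0 &&
           b.x >= a.x && b.x + b.w <= a.x + a.w &&
           b.y >= a.y && b.y + b.h <= a.y + a.h &&
           b.z >= a.z && b.z + b.d <= a.z + a.d;
}

/* Carve b out of a and record the free blocks with the allocation. */
void split_box(Box a, Box b, AllocNode *node)
{
    assert(box_contains(a, b));
    int ax1 = a.x + a.w, ay1 = a.y + a.h, az1 = a.z + a.d;
    int bx1 = b.x + b.w, by1 = b.y + b.h, bz1 = b.z + b.d;
    Box p[6] = {
        { a.x, a.y, a.z, a.w,       a.h,       b.z - a.z }, /* front slab  */
        { a.x, a.y, bz1, a.w,       a.h,       az1 - bz1 }, /* back slab   */
        { a.x, a.y, b.z, a.w,       b.y - a.y, b.d },       /* strip above */
        { a.x, by1, b.z, a.w,       ay1 - by1, b.d },       /* strip below */
        { a.x, b.y, b.z, b.x - a.x, b.h,       b.d },       /* left piece  */
        { bx1, b.y, b.z, ax1 - bx1, b.h,       b.d },       /* right piece */
    };
    node->allocated = b;
    node->nleaves = 0;
    for (int i = 0; i < 6; i++)
        if (box_volume(p[i]) > 0)
            node->free_leaf[node->nleaves++] = p[i];
}

/* When the allocated block is freed and all of its leaves are still
 * free, node and leaves merge back into their bounding box, which is
 * the original block. */
Box coalesce(const AllocNode *node)
{
    Box r = node->allocated;
    int x1 = r.x + r.w, y1 = r.y + r.h, z1 = r.z + r.d;
    for (int i = 0; i < node->nleaves; i++) {
        Box f = node->free_leaf[i];
        if (f.x < r.x) r.x = f.x;
        if (f.y < r.y) r.y = f.y;
        if (f.z < r.z) r.z = f.z;
        if (f.x + f.w > x1) x1 = f.x + f.w;
        if (f.y + f.h > y1) y1 = f.y + f.h;
        if (f.z + f.d > z1) z1 = f.z + f.d;
    }
    r.w = x1 - r.x;
    r.h = y1 - r.y;
    r.d = z1 - r.z;
    return r;
}
```

Keeping the leaves with the node that produced them is what makes the merge cheap: no search of neighbouring blocks is needed when the allocated block is released.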

The GRM Daemon 

The purpose of the GRM process, or daemon, is to manage
the allocation of graphics display hardware resources for
all processes that want to use these resources. As such, it
maintains a comprehensive list of the resources that have
been allocated to these processes. The GRM daemon can
only perform this task correctly if it can be certain that
there is only one GRM daemon process that is allocating
resources to all applications requesting them.

Fig. 8. (a) Two-dimensional views of an allocated block with
front and back added. (b) Data structure representation of
an allocated block with the six free blocks from the original
block of memory.

Typical daemon processes are started by an initialization
script at system boot time. In this situation, uniqueness of
a daemon process can be easily assured by avoiding
multiple invocations of the script that starts the daemon
process. However, the situation for the GRM daemon is
different because the GRM daemon is not started at boot time.

Since the GRM daemon has a specialized purpose, it is
preferable to have it executing only on an as-needed basis,
rather than running continuously as would be the case if
it was started at boot time. The GRM daemon is therefore
designed to be spawned only by a process that requires
access to the display hardware. Of course, it is only
necessary to spawn a GRM daemon process if one has not already
been put into service by another graphics application.

The design of the Starbase/X11 system dictates that the
X server and all Starbase applications absolutely depend
on the proper functioning of the GRM daemon. As such,
the design of the GRM daemon required a foolproof method
to ensure that for a particular host system, exactly one GRM
daemon is given the task of mediating the use of all display
hardware associated with that host, even when two or more
applications attempt to spawn a GRM daemon
simultaneously.

Fig. 9. Allocation of a new block. The new allocated block
becomes a node with six leaves, which represent free blocks.

The HP-UX Semaphore System 

A simple solution to the problem of guaranteeing
uniqueness for the GRM daemon process is to use a semaphore
to ensure that only a single daemon has permission to
continue as the resource manager. Potential GRM
daemons could be started simultaneously, each
trying to test and set the GRM daemon semaphore. The
GRM semaphore mechanism ensures that only one of those
processes actually succeeds in the test and set operation,
with the remaining processes being obligated to recognize
that another GRM daemon process is principal and to exit.*

Using a system semaphore to implement this scheme
would have been trivial had it not been for a limitation in
the behavior of the HP-UX system semaphore during the
creation of a semaphore.** This limitation is that the value
of a semaphore after its creation is not defined.

While the operating system does provide an atomic
operation for creating a system semaphore exclusively (the
operation succeeds only if the semaphore does not already
exist), it does not guarantee the state of the newly created
semaphore to be any particular value. Therefore, a process
can know that it has created a previously nonexistent
system semaphore and that it must initialize the value of the
semaphore, but a separate process cannot know that a given
semaphore has just been created and is not yet initialized.
Since the creation of a semaphore and the initialization of
its value is a two-step process, it is conceivable that another
process might attempt a semaphore operation between the
creation and initialization steps. For an application such
as the GRM daemon, this limitation presented a severe
problem that required a substantial workaround.

*In the context of this article, a semaphore can have a value
of zero or one. A semaphore is initialized to a value of zero.
A test and set operation succeeds only if the semaphore has a
value of zero, and a successful test and set operation results
in the semaphore having a value of one. The process of testing
and setting the semaphore is said to be an atomic operation.

The problem with the system semaphore can be clarified
with an example (see Fig. 10). Consider the situation where
two GRM daemon processes (process A and process B)
have been started and they are both attempting to create
and then test and set the GRM semaphore. Suppose that
process A successfully creates the GRM semaphore. Before
process A has had a chance to initialize the value of the
semaphore, it is preempted by the kernel's scheduler.
Process B then comes along, notices that the GRM semaphore
already exists, and attempts a test and set operation on the
semaphore, which currently has an undefined value. The
test and set operation may or may not succeed depending
on the random value of the semaphore. However, if it does
succeed, process B will think that it has been designated
as the principal GRM daemon and carry on as such.
Meanwhile, process A has regained the processor and proceeds
to initialize the value of the GRM semaphore, overwriting
the effect of the test and set operation of process B.
Subsequently, process A will successfully execute a test and
set operation on the GRM semaphore, resulting in two GRM
daemon processes running when there should only be one.
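
The interleaving in this example can be replayed deterministically with a toy model in C. Nothing below touches a real operating system semaphore: SimSemaphore and the sim_ functions are invented for the illustration, and the "garbage" argument stands in for the undefined value a newly created semaphore may hold.

```c
#include <assert.h>

typedef struct {
    int exists;   /* has the semaphore been created? */
    int value;    /* 0 = free, 1 = taken; arbitrary right after creation */
} SimSemaphore;

/* Exclusive create: succeeds only if the semaphore is absent.
 * The value is left as whatever garbage happens to be there. */
int sim_create_excl(SimSemaphore *s, int garbage)
{
    if (s->exists)
        return 0;
    s->exists = 1;
    s->value = garbage;
    return 1;
}

void sim_initialize(SimSemaphore *s) { s->value = 0; }

/* Test and set: succeeds only when the value is zero, leaving it at one. */
int sim_test_and_set(SimSemaphore *s)
{
    if (!s->exists || s->value != 0)
        return 0;
    s->value = 1;
    return 1;
}

/* Replays the schedule from the text and returns how many processes
 * end up believing they are the principal daemon. */
int replay_race(int garbage)
{
    SimSemaphore sem = { 0, 0 };
    int principals = 0;

    int a_created = sim_create_excl(&sem, garbage);  /* A creates, then is preempted */
    int b_created = sim_create_excl(&sem, garbage);  /* B finds it already exists   */
    if (!b_created)
        principals += sim_test_and_set(&sem);        /* B tests an undefined value  */
    if (a_created)
        sim_initialize(&sem);                        /* A resumes and initializes   */
    principals += sim_test_and_set(&sem);            /* A tests and sets            */
    return principals;
}
```

With a garbage value of zero the replay yields two principals, which is the failure shown in Fig. 10; with a garbage value of one, process B is turned away and only process A continues, so the bug appears only some of the time.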

Various workarounds to the semaphore initialization
problem were attempted, but none of them that exclusively
used system semaphores would work because it could not
be determined whether or not the value of a semaphore
was valid. A colleague who had experienced similar
problems with system semaphores suggested that a file lock***
could be used as the GRM semaphore. Besides being used
to control access to a file, file locks can be used in an
advisory capacity in much the same way as a system
semaphore. File locks have the advantage that the test and
set operation does not require the two-step (not atomic)
"create and initialize" procedure used by system
semaphores. However, file locks can be difficult to manage
when the file being used as the subject of the lock, that is,
the lock file, is not writable or is transitory. As such, the
GRM daemon uses a file lock only as a means to control
access to the system semaphore, and the semaphore is
responsible for awarding a single GRM daemon process the
guarantee of uniqueness.

The GRM Daemon Semaphore System

As mentioned earlier, the purpose of the GRM daemon
semaphore system is to ensure that exactly one GRM
daemon process successfully claims responsibility for
managing the allocation of the display hardware. The system
must be reliable in the face of an arbitrary number of
competing infant GRM daemon processes (processes that have
not yet established their principal status). The system must
not require user intervention even in the face of an ungraceful
exit by a GRM daemon process.

**The HP-UX system semaphore conforms to the AT&T UNIX
System V definition.
***A file lock is a file system semaphore associated with a
segment of a particular file.

(1) For this example, the GRM semaphore has an incidental value of zero following
its creation. In general, an HP-UX semaphore has a random initial value.

(2) Since the GRM semaphore had an initial value of zero, the test and set operation
succeeds.

(3) The effect of the test and set operation of GRM daemon process B at time T3 is
nullified by the initialize operation.

(4) Each of two GRM daemon processes has successfully tested and set the GRM
semaphore. The semaphore thus fails to allow only one process to continue
as the principal GRM daemon.

Fig. 10. Timing diagram of a semaphore failure at initialization
of two processes.

Given these considerations, the GRM semaphore system
was designed to accommodate the following situations:

■ Any process killed without opportunity for a graceful
exit. This means that the design must be able to recover
when the GRM system semaphore and/or lock file are
left around after the process that created them is
terminated by other than programmatic means.

■ An arbitrary number of infant GRM daemon processes
attempting to claim principal (unique) status
simultaneously.

■ An existing GRM daemon process holding the GRM
semaphore while in the process of exiting.

A GRM daemon process has three phases. Its first phase
is during the initialization of an application requiring the
services of a GRM daemon. The second phase is during its
attempt to set the GRM semaphore and claim principal
status. The third phase is the operational phase, when it
is assured uniqueness and carries out the tasks required of
the display hardware resource manager.

The Application Phase. Starbase applications and the X
server must have an executing GRM daemon to function.
During initialization, a GRM library routine within these
programs attempts to make a connection with the GRM
daemon through its designated socket address. If it fails
to make the connection, the routine assumes that there is
no GRM daemon process executing and it spawns a GRM
daemon process. The spawned process "daemonizes" itself
(detaches from any terminal or parent process), sets its user
identification number, and then attempts to establish itself
as the only GRM daemon process.

Claiming Principal Status. Immediately after an infant
GRM daemon process is daemonized, it proceeds in its
attempt to become the only GRM daemon process. Fig. 11
shows the timing diagram for two processes (processes B
and C) trying to claim principal status and control of the
GRM semaphore.

The first step is to test and set the file lock, thereby
claiming exclusive access to the GRM semaphore. In this
way, if the semaphore does not already exist, then the GRM
daemon can create the semaphore and initialize its value
without fear that another GRM daemon process may be
trying to access the semaphore at the same time. The lock
file, which is used as the subject for the file lock, must be
created if it does not already exist. If the file already exists,
either another process is trying to access the GRM
semaphore or a process was killed while attempting such
access.

The next step is to see if the GRM semaphore exists. If
the semaphore already exists, then it is known to have a
valid value. This is true since any GRM daemon process
that created the semaphore is by convention guaranteed to
have initialized its value before releasing the file lock. If
the GRM semaphore does not exist, then it is created and
initialized with a valid value. Since the process is holding
the file lock, it need not worry about another process
attempting to test and set or initialize the value of the GRM
semaphore.

(1) GRM daemon process A holds the GRM semaphore.

(2) GRM daemon process B holds the GRM file lock.

(3) GRM daemon process C holds the GRM file lock.

(4) The test and set semaphore operation includes the creation of the semaphore if it
doesn't already exist. The creation, testing, and setting of a semaphore can
be considered to be an atomic operation since all of these operations are
executed while holding the file lock and only one process can be holding the file
lock at a given time.

(5) GRM daemon process C holds the GRM semaphore.

(6) A retry cycle includes setting the file lock, testing the semaphore, releasing the
file lock, and a short sleep.

(7) Concede and exit (i.e., time out) after enough time has elapsed during the retry
cycle to allow an existing GRM daemon process to service a disconnect request
from its last client, free various resources, remove its listening socket, and
remove the GRM semaphore.

Fig. 11. Timing diagram for the GRM semaphore system.
Once the existence of the GRM semaphore is established
and it is known to have a valid value, an attempt can be
made to test and set the semaphore. If successful, the
semaphore is then held by the process that set it. Once the
semaphore is set, the lock file can be removed, allowing
other processes to create a new lock file in order to access
the semaphore. The life of the lock file is generally limited
to the duration of the creation and initialization of the GRM
semaphore.

If the test and set or any of the preceding operations is
not successful, then the file lock must be released to provide
other infant GRM daemon processes the opportunity to
access the GRM semaphore. After the file lock has been
released, the infant GRM daemon process will sleep for a
short period of time and then retry the entire procedure.
The sleep duration is short enough to expedite the GRM
daemon startup procedure. However, the retry loop results
in a delay that is long enough to ensure that there is enough
time for an exiting GRM daemon to finish its exit and clear
the GRM semaphore.
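
The retry cycle described above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the class and function names (GrmState, try_claim, claim_principal, release_principal) are hypothetical, the file lock and semaphore are modeled as plain attributes rather than real HP-UX objects, and the retry budget of five cycles is arbitrary.

```python
class GrmState:
    """In-memory stand-in (assumed) for the lock file and the GRM semaphore."""
    def __init__(self):
        self.lock_owner = None   # process currently holding the file lock
        self.semaphore = None    # None = does not exist, 0 = clear, 1 = set
        self.principal = None    # process holding the semaphore


def try_claim(state, pid):
    """One retry cycle: take the file lock, test and set, release the file lock."""
    if state.lock_owner is not None:
        return False             # another process holds the file lock
    state.lock_owner = pid       # test and set the file lock
    try:
        if state.semaphore is None:
            state.semaphore = 0  # create and initialize under the file lock
        if state.semaphore == 0:
            state.semaphore = 1  # test and set succeeded
            state.principal = pid
            return True
        return False             # semaphore is held by another daemon
    finally:
        state.lock_owner = None  # always release the file lock


def claim_principal(state, pid, max_retries=5, sleep=lambda: None):
    """Retry until principal status is claimed or the retry budget runs out."""
    for _ in range(max_retries):
        if try_claim(state, pid):
            return True
        sleep()                  # short sleep between retry cycles
    return False                 # concede and exit


def release_principal(state, pid):
    """An exiting principal daemon removes the semaphore."""
    if state.principal == pid:
        state.semaphore = None
        state.principal = None
```

In this sketch a second process concedes while the first holds the semaphore, and succeeds once the first has called release_principal, which mirrors the behavior of processes B and C in Fig. 11.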

Operations Phase. Once established as the principal GRM
daemon process, the GRM daemon goes about initializing
its data structures and opening its listening socket to begin
serving its purpose. One or more GRM clients will then
make connections to the GRM daemon and request display
hardware resources as needed. When a GRM client exits,
its connection with the GRM daemon is closed and the
resources allocated to it are freed and made available to
other GRM clients.

When the GRM daemon detects the absence of its clients,
it removes the listening socket, removes the semaphore,
and then exits. Any GRM client that may be starting up at
this time will fail to establish a connection, which includes
verifying the connection with a full handshake, and it will
start the process over again by spawning a new GRM
daemon.



Conclusion 

The GRM provides a means for allocating a system's
display resources among various competing clients. The
GRM also provides a means of sharing information among
the clients through the encapsulation of the information
in objects. One client can access an existing object if it
knows the name of the object, even if the object was created
by another client. The client accesses the data by asking
the GRM to open the object. The GRM also provides a
sophisticated memory allocation mechanism for the scarce
offscreen memory resource. The mechanism includes a
means to coalesce freed fragments of offscreen memory for
reuse. Finally, the design of the GRM interface ensures that
only one GRM daemon process runs on a given system,
even though several clients initiate access to the GRM
simultaneously.

Sharing Access to Display Resources in
the Starbase/X11 Merge System

The Starbase/X11 Merge system provides features to allow
Starbase applications direct access to the display hardware
at the same time X server clients are running. There are also
capabilities to allow sharing of cursors and the hardware
color map.

by Jeff R. Boyton, Sankar L. Chakrabarti, Steven P. Hiebert, John J. Lang, Jens R. Owen, Keith A.
Marchington, Peter R. Robinson, Michael H. Stroyan, and John A. Waitz



HP'S GRAPHICS DISPLAY HARDWARE provides
many display resources that must be carefully managed
to maintain order on the display when competing
HP-UX processes, such as the X server and Starbase
applications, are attempting to access the display hardware
at the same time. The hardware resources that must be
shared among these processes include the frame buffer
(video RAM), cursors, fonts, and the color map. This article
discusses methods used to allow Starbase applications and
the X server to share access to this common pool of hardware
resources, and a method called direct hardware access
(DHA), which enables Starbase applications to achieve high
performance when accessing the display, while maintaining
the integrity of the X Window System.

Display Hardware 

Fig. 1 is a block diagram of a typical graphics display.
This is a generalized model and does not represent the
implementation of any particular graphics product. Some
elements are optional; for example, only 3D systems need
a z-buffer and some low-end graphics systems have no
graphics accelerator.

Graphics Accelerator. The graphics accelerator provides
specialized hardware to perform graphics operations on
commands and data from the display driver running on
the host system. The fundamental job of the accelerator is
to apply viewing and modeling transforms and light source
models to the data to convert it into a format usable by the
scan converter. The scan converter consists of hardware
for the generation of pixel data that represents polyline
and polygon primitives. Operations on more than one window
are supported by the window control logic, and hidden
surface removal is provided by the z-buffer. The accelerator
also has responsibility for the control of most other hardware
resources in the graphics processor, such as the
configuration of the frame buffer and color map.

Fig. 1. Block diagram of a typical
HP hardware display system.
Frame Buffer. The frame buffer is a specialized (usually
dual-ported) RAM. Each addressable location in the frame
buffer corresponds to a picture element, or pixel. Some
portion of the frame buffer is displayable, so its contents
represent the current image on the screen. Pixel values are
read sequentially from the frame buffer and converted to
a video signal by the color map and its associated circuitry.
The entire frame buffer can be scanned as many as 60 times
per second to keep a steady image on the monitor. The
portion of the frame buffer that is not displayed is called
offscreen memory. Special circuitry called a block mover,
which is located in the frame buffer controller, is used to
copy a rectangular region from one place in the frame buffer
to another. Both the on-screen and offscreen portions of
the frame buffer are accessible to the graphics accelerator.

The frame buffer is physically separate from system RAM,
but it is mapped into the virtual address space of all
processes that access it. Therefore, it is possible for several
processes to have the same physical RAM of the frame
buffer mapped into their virtual address space (see Fig. 2).
This requires that processes cooperate when making
modifications to the frame buffer. The methods we use to
share the frame buffer are discussed later.
Color Map. The color map is a very high-speed lookup
table that maps the numbers stored in the frame buffer to
the actual color values. The user specifies the mapping
with commands like: the number 5 in the frame buffer
represents the color a,b,c where a,b,c are the intensities of
red, green, and blue that must be mixed to create the desired
color. After looking these intensities up, the color map
converts them to analog voltages and sends them to the
monitor.
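
The lookup performed by the color map can be illustrated with a short sketch. The table size of 256 entries, the 0 to 255 intensity range, and the function names are assumptions made for the example; the actual hardware table and its programming interface are not described here.

```python
def make_color_map(size=256):
    """Lookup table: frame buffer value -> (red, green, blue) intensities."""
    return [(0, 0, 0)] * size


def set_color(color_map, index, red, green, blue):
    """Define which color a given frame buffer value represents."""
    color_map[index] = (red, green, blue)


def scan_out(pixels, color_map):
    """Convert a row of frame buffer values into RGB intensity triples."""
    return [color_map[p] for p in pixels]
```

For example, after set_color(cm, 5, 255, 128, 0), every pixel whose frame buffer value is 5 is scanned out as that orange, without any change to the frame buffer contents.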
Direct Hardware Access

In the X Window System, user processes, or clients, do
not render directly to the frame buffer. To gain access to
the frame buffer, clients make drawing requests to the X
server, which is the only process with access to the frame
buffer. The server has control and knowledge of the state
of the frame buffer. However, to achieve maximum
performance and functionality, some clients, such as Starbase
applications, require direct access to the frame buffer. To
obtain direct access to the display in an organized way, a
client must cooperate with the server. The client must
obtain information from the server about the areas of the
frame buffer that represent the visible area of the client's
window, and all rendering by the client must be clipped
to this area. This is done by requesting the server to register
an existing window for direct hardware access (DHA). In
response to this request the server sets up mechanisms to
pass the clipping information to the client and to update
it as necessary.

Two methods are used to pass information from the
server to the DHA client: shared memory and HP X extension
library calls. Graphics resource manager shared memory
is used for information that does not change in size,
such as the cursor state and fonts. Variable-size data such
as the clip list is passed to the client via HP X extension
library calls (routines with an XHP prefix). Using shared
memory for variable information would create shared memory
fragmentation problems, and the overhead of conversing
with the graphics resource manager (GRM), which manages
the shared memory area used by X server and Starbase
processes, could cause performance problems. The
communication links between a DHA client and the X server
are shown in Fig. 3.

Data Structures 

Fig. 4 shows the data structures in GRM shared memory
and process private memory that allow direct hardware
access by Starbase DHA applications. Pictured are the data
structures that would exist for one window on one screen.
Multiple windows, color maps, and screens are supported
and many of the structures shown are replicated in such
circumstances. The X server and the Starbase processes
have pointers for accessing the data structures in shared
memory. The data items shown in Fig. 4 will be referenced
and explained in later sections of this article.

Opening a Window 

To allow a Starbase DHA process to be ported to run in
X with little or no source code changes, it is important that
the normal gopen() procedure work the same way it does
when the application is drawing directly to a graphics
display.

The following activities occur during a Starbase open
(gopen()) of an X window:

■ If it is not already running, the graphics resource manager
is started so that the Starbase process can access shared
memory objects resulting from a DHA window registry.

■ Tests are made to determine if the pathname parameter,
which names the window, refers to an X window or one
of the other supported objects of gopen().

■ If the object being opened is an X window, the host
name, the display identifier, and the screen number are
obtained. If a driver-level socket connection to the server
for that screen does not exist, one is opened.

■ If the window is to be an accelerated window, an
accelerator state identifier is generated.

■ The XHPRegisterWindow() procedure is called. If it
succeeds, then a data structure (DHA window object) will
be created in shared memory that contains the registered
window information.

■ The frame buffer is opened and mapped into virtual
space using the device pathname returned by the window
registry call.

■ The registered window object and other DHA shared
memory objects, such as the DHA screen object, the display
state, and the X server's cursor state, are opened.
These data structures are shown in Fig. 4.

■ The initial clip list for the window is obtained from the
server.

Registering for DHA Access 

An HP X library extension, XHPRegisterWindow(), was
added to the server to allow a client to request DHA access
to a window. The client passes the identification numbers
of the desired window and the screen containing the window.
Additionally, the client may request that the window
be registered for use by a graphics accelerator. Upon receipt
of the registration request, the server requests the graphics
resource manager to create a structure in shared memory
and fill it with information pertinent to the window. In
Fig. 4 this structure is called the DHA window object. The
information in the DHA object for each registered window
includes:

■ clipstamp. An integer counter that is incremented
whenever the clip list for the window changes. This is
used as a trigger to the client that it needs to obtain a
new clip list via the XHPGetClipList() library procedure.

■ nUsers. An integer value representing the number of
registrations active against the window.

■ nAccelerated. An integer value representing the number
of accelerated registrations active against the window.

■ windowid. An integer value representing the server's
identification number for the registered window.

After the DHA window object is created, the server passes
its GRM shared memory identification back to the client.
The client obtains access to the DHA window object in
GRM shared memory and reads the information supplied
by the server. As the state of the window changes, the
information in the DHA window object is updated by the
server.

Fig. 3. Communication paths between a DHA client and the
X server. GRM = graphics resource manager.

A window may be registered for DHA access multiple
times by the same client or by multiple clients. All
registrations use the same shared memory object (DHA window
object). A count is kept of the number of current
registrations on a window. A client terminates a registration
with the library procedure XHPUnRegisterWindow(). When the
number of registrations drops to zero, the server requests
the GRM to delete the shared memory object and the window
is no longer directly accessible by clients.
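
The registration counting described above can be sketched as follows. This is a simplified model, not the server's code: the DhaRegistry class and its methods are hypothetical, and a Python dictionary stands in for the DHA window object that the GRM would create in shared memory. Only the field names clipstamp, nUsers, and nAccelerated come from the article.

```python
class DhaRegistry:
    """Tracks one shared DHA window object per registered window (a sketch)."""

    def __init__(self):
        self.objects = {}   # window id -> dict standing in for the shared object

    def register(self, window_id, accelerated=False):
        # All registrations of a window share the same object.
        obj = self.objects.setdefault(
            window_id,
            {"clipstamp": 0, "nUsers": 0, "nAccelerated": 0},
        )
        obj["nUsers"] += 1
        if accelerated:
            obj["nAccelerated"] += 1
        return obj

    def unregister(self, window_id, accelerated=False):
        obj = self.objects[window_id]
        obj["nUsers"] -= 1
        if accelerated:
            obj["nAccelerated"] -= 1
        if obj["nUsers"] == 0:
            # Last registration is gone: delete the shared object.
            del self.objects[window_id]
```

Two registrations of the same window return the same object, and the object disappears only when the last registration is terminated.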

The Clip List 

The visibility and position of the registered window can
change at any time. The user can partially obscure the
registered window with another window, move it to another
area, iconify it, and so on. The clip list is a list of rectangles
describing the areas of the window that are visible or
obscured. In Fig. 5a window A is partially obscuring window
B. Window A is completely visible and its clip list
consists of only one rectangle. The clip list for window B
consists of three rectangles, two visible and one obscured.

The clip list can be as small as one
rectangle (the window is fully visible) or as large as several
hundred rectangles. Rather than pass this information
through shared memory, it is the responsibility of the DHA
client to request the list via a library procedure. The
clipstamp, which is created when a DHA client registers a
window, provides a fast mechanism to notify all interested
DHA clients when the clip list changes and they need to
obtain a new clip list.

Whenever the clip list for a window changes because of
events such as a window move or stacking order change,
the server increments the clipstamp field of the DHA window
object. When the DHA client wishes to render, it compares
the clipstamp in the DHA window object against a local
copy. If they differ, the client knows the clip list has
changed since its last rendering operation and it must
request a new clip list. After making the request, the client
copies the shared memory value of the clipstamp into its
local copy for the next time. This mechanism avoids
synchronization problems because no client ever clears the
clipstamp field. Multiple clients sharing the same window
merely bring themselves into synchronization with the
current clipstamp value.
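
A minimal sketch of the clipstamp comparison follows. The SharedWindow and DhaClient classes are hypothetical, a plain attribute stands in for the shared memory field, and copying the list stands in for the library request to the server. The point it illustrates is that only the server writes the counter, so any number of clients can compare against it independently.

```python
class SharedWindow:
    """Stand-in (assumed) for the DHA window object in shared memory."""

    def __init__(self):
        self.clipstamp = 0
        self.clip_list = [(0, 0, 100, 100)]

    def server_change_clip(self, new_list):
        self.clip_list = list(new_list)
        self.clipstamp += 1          # only the server writes this field


class DhaClient:
    def __init__(self, window):
        self.window = window
        self.local_stamp = None      # forces a fetch before the first render
        self.clip_list = []
        self.fetches = 0

    def prepare_to_render(self):
        shared = self.window.clipstamp
        if shared != self.local_stamp:
            # Stands in for requesting a new clip list from the server.
            self.clip_list = list(self.window.clip_list)
            self.local_stamp = shared
            self.fetches += 1
        return self.clip_list
```

A client that renders repeatedly without an intervening change fetches the list once; each client sharing the window catches up on its own schedule.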

To obtain a new clip list, the client uses the library
procedure XHPGetClipList(). The client passes to the server the
identification numbers of the registered window and the
screen containing the window. The procedure returns to
the client the following information:

■ x, y. Integer values representing the origin (upper left
corner) of the window. This value is relative to the origin
of the screen.

■ Width. An integer value representing the width in pixels
of the window.

■ Height. An integer value representing the height in pixels
of the window.

■ Count. An integer value representing the number of
rectangles in the clip list.

■ Clip List Pointer. A pointer to a list of rectangles
constituting the clip list.

The DHA client knows the size of the frame buffer and
where the frame buffer's physical memory is mapped in
its virtual memory space. By using this information in
conjunction with the origin and size of the window, the client
can index into the frame buffer and calculate the memory
addresses it is allowed to access.

Fig. 5. (a) Two overlapping windows showing their positions
in screen coordinates. (b) The clip lists for the overlapping
windows in window-relative coordinates where X1,Y1 = upper
left and X2,Y2 = lower right. X2 and Y2 are one pixel outside
of the true boundary to make the mathematics easier.
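
The address calculation can be illustrated with a short sketch. It assumes a linear, row-major frame buffer in which each row occupies fb_width pixels and each pixel occupies a fixed number of bytes; real frame buffer layouts may differ, and the function and parameter names are hypothetical.

```python
def pixel_offset(fb_width, win_x, win_y, x, y, bytes_per_pixel=1):
    """Byte offset of window-relative pixel (x, y) from the frame buffer base.

    Assumes a linear, row-major layout with fb_width pixels per row.
    """
    screen_x = win_x + x      # window origin is relative to the screen origin
    screen_y = win_y + y
    return (screen_y * fb_width + screen_x) * bytes_per_pixel


def pixel_address(fb_base, fb_width, win_x, win_y, x, y):
    """Virtual address of a pixel, given where the frame buffer is mapped."""
    return fb_base + pixel_offset(fb_width, win_x, win_y, x, y)
```

Under these assumptions, the origin of a window at screen position (100, 50) on a 1024-pixel-wide frame buffer lies 50 * 1024 + 100 bytes past the base of the mapping.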

The union of the rectangles in the clip list covers the
renderable area of the window. Each rectangle is specified
by the x,y coordinates of its upper left and lower right
corners. The values of these coordinates are relative to the
origin of the window (see Fig. 5b). Each rectangle is marked
as either visible or obscured. Visible rectangles are visible
on the screen. Obscured rectangles are not visible because
they are either obscured by another window or are partially
off the screen. The client traverses this list, rendering into
the visible rectangles. If the window has no backing store,
which is a location in memory for backing up windows
that become obscured, rendering to the obscured areas is
discarded. If the window has backing store available and
the client can render to it, then rendering to obscured
rectangles is diverted to the backing store. Backing store is
discussed in detail later in this article.

The client can request a clip list in one of three formats:
YXBANDED, VISIBLE, or OBSCURED. In the YXBANDED format,
both visible and obscured rectangles are present in the list
(see Fig. 6). They are split and ordered so that all rectangles
with the same y-origin will have the same height, thus
creating bands across the window. Rectangles in the same
band are sorted by increasing x-origin value. This type of
ordering can enhance performance when rendering is done
by filling horizontal scan lines. In the VISIBLE and
OBSCURED formats, only rectangles of the respective type
are present in the list. They are coalesced into the fewest
possible number of rectangles and are not ordered. These
formats are useful for displays that have hardware clipping
capabilities.
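
To make the YXBANDED ordering concrete, the following sketch builds a banded list for a window partially covered by other rectangles. It is an illustration only, not the server's algorithm: the function name and the tuple layout (x1, y1, x2, y2, visible) are assumptions, coordinates are window-relative with x2 and y2 one pixel outside the boundary as in Fig. 5b, and no attempt is made to merge adjacent bands.

```python
def yx_banded(width, height, obscurers):
    """Split a window into visible/obscured rectangles, banded by y then x.

    Rectangles are (x1, y1, x2, y2) in window coordinates, x2 and y2 exclusive.
    Returns a list of (x1, y1, x2, y2, visible) tuples.
    """
    # Clip each obscuring rectangle to the window.
    clipped = []
    for x1, y1, x2, y2 in obscurers:
        x1, y1 = max(x1, 0), max(y1, 0)
        x2, y2 = min(x2, width), min(y2, height)
        if x1 < x2 and y1 < y2:
            clipped.append((x1, y1, x2, y2))

    # Every horizontal edge starts a new band.
    y_edges = sorted({0, height} | {y for r in clipped for y in (r[1], r[3])})
    result = []
    for ya, yb in zip(y_edges, y_edges[1:]):
        covering = [r for r in clipped if r[1] <= ya and r[3] >= yb]
        x_edges = sorted({0, width} | {x for r in covering for x in (r[0], r[2])})
        band = []
        for xa, xb in zip(x_edges, x_edges[1:]):
            visible = not any(r[0] <= xa and r[2] >= xb for r in covering)
            if band and band[-1][4] == visible:
                # Extend the previous rectangle instead of splitting needlessly.
                band[-1] = (band[-1][0], ya, xb, yb, visible)
            else:
                band.append((xa, ya, xb, yb, visible))
        result.extend(band)   # within a band, rectangles are in increasing x
    return result
```

For a 200-by-100 window whose upper left corner is covered by another window, the sketch produces three rectangles, one obscured and two visible, which matches the count given for window B in Fig. 5.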

A DHA client can use the X facility XSetClipRectangles() to
restrict rendering to a subset of the window. If the graphics
context containing the client clipping is specified to the
XHPGetClipList() function, the resultant clip list will be
restricted to that subarea.

MOMA Windows 

Multiple, obscurable, movable accelerated windows, or
MOMA windows, refers to the hardware logic in the graphics
accelerator that provides very fast drawing and clipping
of multiple windows. The HP 98556A 2D Integer-Based
Graphics Accelerator and the HP 98732A 3D Graphics
Accelerator contain graphics accelerator engines that use
hardware facilities for clipping. When a DHA client wishes
to use an accelerator to render into a window, it registers
the window as accelerated. For some devices, such as the
HP 98556A, this also implies that the server will allocate
a MOMA hardware clipping state on behalf of the client.
For other devices, the DHA client allocates the clipping
state.

When the clip list for an accelerated window changes,
the server downloads the new clip list directly into the
MOMA hardware on behalf of the client. However, there
may be reasons why the DHA client must also be able to
load the clip list directly into the accelerator. For example,
on the HP 98732A 3D Graphics Accelerator, the clipping
rectangles for only a single window are stored on the device.
As graphics contexts are swapped into the accelerator,
appropriate clip rectangles must be loaded into the MOMA
hardware. When the server is able to maintain the clip list
state in the accelerator, the accelerated DHA processes are
able to achieve a steady throughput because they do not
have to spend time downloading clip lists.

The server itself does not take advantage of graphics
acceleration. There are two reasons for this. Currently no
graphics accelerators render according to all the X
specifications. More important, HP's accelerators are basically
first-in, first-out queues: the rendering commands are
processed in the order they arrive. Some operations that can
be performed by HP's advanced graphics devices, such as
the HP 98732A, can take a significant amount of time for
X to perform. However, a critical factor in the usability of
a window system like X is the response time for operations
such as window moves and creations. If the X server
operations must wait in line behind a long stream of complicated
graphics primitives, the response time will not be
acceptable.

Starbase/X11 Merge Locking Strategy

Graphics driver software is closely coupled to the
graphics hardware it supports. The driver routines set
hardware registers to certain values and then drawing
operations or other actions are started. In a multitasking
environment such as HP-UX, there may be more than one
process that includes a graphics driver that needs to access
the display hardware, and one process may be preempted
or swapped out at any time, even during the execution of
driver procedures. To prevent indeterminate results arising
from multiple processes using the graphics hardware in an
uncontrolled way, there must be some means of restricting
access to one process at a time. The permission (or token)
to use the display must be passed from one process to
another.

1. In stacked screens mode, the other screen on the
same device is referred to as this screen's
"related screen."

2. Since stacked screens implies sharing one piece
of hardware, only one lock exists in the HP-UX kernel
so only one screen or the other can lock the device.

3. If one screen of a stacked screens mode server
takes the lock from another, the screen losing
the lock can make no more assumptions about the
hardware. Therefore, the old screen's rendering
state is invalid.

Fig. 7. Flowchart for the routine xosPrepareToRender, which is
used to handle locking within the X server.

One way this might be done is by implementing a token
that the kernel controls. Only the process that has the token
would be allowed to access the graphics hardware, and all
other processes would actually be prevented from accessing
the registers. This is not how the problem is solved in
HP-UX. Instead, all processes are free to access the
hardware, requiring that a convention be established and
followed to ensure that only one process gains access to
the graphics display at one time. The HP-UX kernel helps
in this matter by providing a token in the form of a software
semaphore, and by blocking processes that request the
semaphore while another holds it. Processes that do not
follow the protocol of waiting to gain access to the token
are not prevented from changing the hardware registers.
The special kernel semaphore in the Starbase/X11 Merge
system is often called the display lock or kernel lock, and
it locks access to the physical display.

X Server and DHA Processes 

Since the display lock is a system resource that processes
contend for, it is a prime candidate for creating the classic
deadlock problem. A typical deadlock problem was
encountered and solved for a situation involving a Starbase
DHA process and the X server. A Starbase process might
gain access to the display lock not only to operate on the
display hardware, but also to operate on shared memory
structures associated with the display. In the course of its
operations, it may need to call one of the standard HP
extension X procedures to communicate with the server.
When the server wakes up to service this request, as well
as any other input it has received, it attempts to get the
display lock. A deadlock occurs because the Starbase process
is waiting for the server to respond, but the server is
waiting for the display lock.

To solve this and similar problems in the Starbase drivers,
the calls to X procedures are strategically placed outside
of code regions where the lock is held. An interesting
example of this is the code to fetch a new window clip
list. As long as a Starbase process running in a window
does not hold the lock, the X server can process a request
to change the clip list for the window. However, if the
Starbase process gets the lock, then it cannot ask the server
for the current clip list because of the deadlock that would
result. The code to solve this problem incorporates the
following algorithm:

while the clip list is out-of-date
    request a new clip list from the server
    get the display lock
    if the clip list that was fetched is still up-to-date
        then exit the loop (go on to render)
        else release the lock (go back around the loop again)
endwhile
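
The loop above can be sketched in runnable form. This is an illustration under stated assumptions, not the Starbase driver code: a threading.Lock stands in for the kernel display lock, the Display class and the client dictionary are hypothetical, and copying the list stands in for the request to the server. The sketch folds the initial out-of-date check into the loop so that the function always returns holding the lock.

```python
import threading


class Display:
    """Stand-in (assumed) for the clipstamp, the clip list, and the display lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.clipstamp = 0
        self.clip_list = ["full window"]


def lock_with_current_clip_list(display, client):
    """Return holding display.lock, with the client's clip list known current."""
    while True:
        if client["stamp"] != display.clipstamp:
            # Talk to the server only while NOT holding the display lock.
            client["stamp"] = display.clipstamp
            client["clip_list"] = list(display.clip_list)
        display.lock.acquire()
        if client["stamp"] == display.clipstamp:
            return               # still up to date: go on to render
        display.lock.release()   # changed in the gap: go around again
```

The key design point is that the only step that needs the server's cooperation happens before the lock is taken, and the check after taking the lock catches a clip list change that slipped in between.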

Locking within the X Server 

The X server typically processes requests from several
clients for one or more windows each time it detects that
there is input to be processed (a wakeup). At some point
during this processing, before the graphics hardware is
accessed, the server process must obtain the display lock.

All access to the hardware in the Starbase/X11 server is
governed by the routine xosPrepareToRender() and its greatly
simplified cousin xosLockDevice(). The duties of
xosPrepareToRender() are to verify ownership of or claim the
display lock, remove cursors (Starbase or X) from the area to
which the server is about to render, and ensure that the X
server's and X display driver's concept of the current
rendering state are the same. Fig. 7 summarizes the actions of
xosPrepareToRender(). xosLockDevice(), as its name implies, only
performs the locking portion of xosPrepareToRender(). It is
used when it is desired to lock the hardware but not change
the display.

In some places it is difficult for the X server software to
determine whether the lock is already held. To handle the
possibility of nested attempts to gain the display lock, each
X display driver maintains a lock count. When the lock
count (nesting level) reaches zero, the X display driver
issues an unlock call to the graphics driver in the HP-UX
kernel that maintains the semaphore for the locked device.
Immediately before unlocking the device, the X display
driver resets the hardware and any software registers it
might be maintaining to a state consistent with the
expectations of other processes that might access the display.
Under normal circumstances, this reset is valuable. However,
in stacked screens mode the reset is disastrous.
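
The nested lock count can be sketched as follows. The DisplayDriverLock class is hypothetical, and a list of strings stands in for the calls that would reach the kernel graphics driver and the hardware; the sketch shows only that the reset and the kernel unlock happen when the outermost unlock brings the count to zero.

```python
class DisplayDriverLock:
    """Nested lock counting in front of a single kernel-level lock (a sketch)."""

    def __init__(self):
        self.count = 0
        self.log = []    # records the calls that would reach the kernel

    def lock(self):
        if self.count == 0:
            self.log.append("kernel lock")   # only the outermost lock claims it
        self.count += 1

    def unlock(self):
        if self.count == 0:
            raise RuntimeError("unlock without matching lock")
        self.count -= 1
        if self.count == 0:
            # Reset to the state other processes expect, then let go.
            self.log.append("reset hardware state")
            self.log.append("kernel unlock")
```

With one counter per open device, two opens of the same physical hardware would each trigger the reset independently when their own count reaches zero.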

In stacked screens mode one physical display device is
made into two screens and is opened as two separate
devices. This causes the display driver to maintain a separate
lock count for each open. If either count goes to zero, the
physical device will be reset and unlocked. A busy server
is likely to render to both screens in a single wakeup, so
locking one half of a stacked screens mode server must
imply locking the other half. Although the display lock is
shared, rendering to one half of a stacked screens device
invalidates whatever is known about the hardware state in
the other half. Stacked screens mode is described in the
article on page 33.

Since claiming the lock on a device excludes other
processes from access to that device, sharing the hardware
requires that the lock be claimed at the last minute. The
deferred lock claim avoids holding off direct hardware
access clients any more than necessary. This requirement is
especially critical when running with multiple physical
screens. There is obviously no need to hold off direct
hardware access to one screen while the X server is writing
to another.

Each X display driver provides an entry point called
ValidateRenderingState(). This routine ensures that the
hardware, display driver, and server are consistent and set up
for rendering. Calls to ValidateRenderingState() can be very
expensive, so care is taken to use it as little as possible.
The usual reason for calling ValidateRenderingState() is that
the hardware state is unknown or known to be invalid. For
example, when the display driver releases the lock, the
hardware is returned to its base state, so revalidation of
the rendering state is necessary upon claiming the lock.

To minimize the number of times ValidateRenderingState()
is called, the server keeps a pointer to the last rendering
state structure used for each screen. This pointer is set to
null whenever the lock is surrendered, the cursor changes
shape, color, or position, or the attributes of the window
change. Any of these changes means that the contents of
the rendering structure itself have changed. When
xosPrepareToRender() is invoked, if the new rendering structure
is the same as the current one, the call to
ValidateRenderingState() may be skipped.
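
The cached pointer can be sketched as follows. The Screen class and its method names are hypothetical, and a counter stands in for the expensive call to ValidateRenderingState(); the comparison is by identity, which corresponds to comparing pointers.

```python
class Screen:
    """Per-screen cache of the last validated rendering state (a sketch)."""

    def __init__(self):
        self.last_state = None    # last validated rendering state, or None
        self.validations = 0

    def invalidate(self):
        """Called when the lock is surrendered or cursor/window attributes change."""
        self.last_state = None

    def prepare_to_render(self, state):
        if state is not self.last_state:
            self.validations += 1   # stands in for the expensive validation
            self.last_state = state
```

Repeated rendering with the same state structure validates once; any invalidating event forces the next call to validate again.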

Sharing Cursors 

In the effort to ensure that Starbase applications running
in the X Window System have full functionality so that
user programs can be used in the new environment without
source code changes, one particular area of Starbase
functionality that proved especially difficult was the
implementation of cursors in windows. Starbase implements many
different kinds of cursors, including crosshairs, rubberband
boxes, and raster cursors. For a Starbase process to draw,
it must remove the cursors that interfere with the window
to be accessed, perform the rendering operations, and
replace the cursors. The same is true of the X server when
it needs to render somewhere on the screen. The shared
drivers effort described on page 7 allows much of the code
that draws and undraws the cursor to be shared, but there
is still a lot of logic that had to be carefully designed to
ensure that the server and Starbase behave correctly in all
situations. In Fig. 4 the data structure labeled "Cursor
State" contains a data block for each cursor.

Cursor removal is complicated by the existence of both
Starbase and X cursors. These two types of cursors have
significant differences. Starbase cursors can have multiple
instantiations: one window can contain more than one
Starbase cursor. In the X environment only one X cursor
can exist on the screen. Starbase cursors also differ from
X cursors in that Starbase cursors are clipped to the
windows containing them. Starbase cursors cannot extend into
the borders of their containing windows. The X cursor is
a global entity in that it is never clipped and can extend
through multiple windows and their borders. To ensure
that cursor operations are consistent and predictable, all
the cursors in a window have a stacking order, and no
cursor can be moved or operated on unless all the cursors
on top of it have been removed. The X cursor is always on
top.

Because Starbase is allowed to gopen (open) a single
window many times, it is possible for an X window to have
multiple Starbase cursors in it. A mechanism was added
to the Starbase display drivers to maintain a list of active
cursors for a particular window. This list, which is labeled
"Echo List" in Fig. 4, is located in GRM shared memory.
The list is traversed before the Starbase drivers do any
rendering, and in the procedures associated with the XDI
entry calls RemoveCursor() and ReplaceCursor(), each active
cursor in the list is removed in the order it is found. When
a Starbase cursor is activated, the Starbase driver adds it
to the list, and when the Starbase cursor is deactivated or
the program dies, the cursor is removed from the list.
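
As a rough illustration of the echo list, the sketch below keeps a per-window list of active cursors and walks it around a rendering operation. The names are hypothetical, and replacing the cursors in reverse order is an assumption made here to respect the stacking order; the text only states the removal order.

```python
class EchoList:
    """Hypothetical stand-in for the per-window list of active Starbase cursors."""

    def __init__(self):
        self.active = []               # cursors in the order they were activated

    def activate(self, cursor):
        self.active.append(cursor)

    def deactivate(self, cursor):
        self.active.remove(cursor)


def render(echo_list, draw, log):
    # Remove each active cursor in the order it is found in the list.
    for cursor in echo_list.active:
        log.append(("remove", cursor))
    draw()
    # Assumption: cursors go back in the opposite order.
    for cursor in reversed(echo_list.active):
        log.append(("replace", cursor))


echo = EchoList()
echo.activate("crosshair")
echo.activate("rubberband")
log = []
render(echo, lambda: log.append(("draw", "line")), log)
```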

The X display drivers also use functions associated with
the XDI entry points RemoveCursors() and ReplaceCursors() to
help the X server remove and replace the X and Starbase
cursors before and after rendering operations. Unlike the
routines used by the Starbase drivers, these routines accept
flags to perform selective removal of Starbase cursors, the
X cursor, or both. This is accomplished without the X
server's having to know very much about the cursors'
relative stacking order or other details. Once the X cursor is
removed, it remains removed until the device is unlocked.
The principal reason for not replacing the X cursor until
the last minute is to avoid invalidating the current
rendering state.

Removing cursors is an expensive process, so care is
taken to avoid unnecessary calls to the RemoveCursors()
routine. The server keeps a flag for each
window to indicate whether cursors have been removed.
Since the cursor removal code in the display drivers only
removes cursors from visible areas, cursors
must be removed before changing the clip list in those
cases where the window situation is being modified (e.g.,
a window is being moved or iconified). The cursors are
replaced using the new clip list, thereby drawing them into
any newly exposed areas.

Cursor removal is further complicated by Starbase
cursors in reserved planes. On the SRX and TurboSRX display
systems the fourth overlay plane can be used to hold
Starbase cursors. The fourth plane is used for cursors by writing
the cursor color into the top eight entries of the color map.
Whenever the fourth plane has a one in it, the cursor color
will be displayed on the screen, allowing the cursor to be
drawn in the overlays without destroying the color already
there. Clearing the fourth plane restores the old color. Since
these cursors need not be removed for normal rendering,
the RemoveCursors() routine typically does not remove them.
In some situations, such as changing the stacking order of
windows, moving windows, and so on, these cursors must
be removed because their associated windows may become
fully or partially obscured. These situations are handled
by catching them and passing the flag ALL_PLANES to the
X display driver when calling RemoveCursors(). Of course,
use of the ALL_PLANES flag must be remembered so it can
be passed to ReplaceCursors() when placing the cursors back
on the screen.
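
The fourth-plane technique can be illustrated with a small model of a four-plane overlay color map. The sketch assumes that ordinary overlay rendering uses only the lower three planes (entries 0 through 7), which the text implies but does not state; the color names are placeholders.

```python
CURSOR_PLANE = 0x8                        # fourth overlay plane (bit 3)


def load_cursor_color(color_map, cursor_color):
    # Entries 8..15 are exactly the indices that have the fourth plane set.
    for index in range(8, 16):
        color_map[index] = cursor_color


color_map = ["color%d" % i for i in range(16)]
load_cursor_color(color_map, "cursor")

pixel = 0x5                               # existing value in planes one to three
with_cursor = pixel | CURSOR_PLANE        # draw the cursor: set plane four
restored = with_cursor & ~CURSOR_PLANE    # erase it: clear plane four
```

Because the lower three bits are never touched, erasing the cursor needs no saved copy of the pixels underneath it.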

Sharing Fonts 

The fast alpha/font manager (FA/FM) system is a utility
package that Starbase applications use to display raster
text. This proprietary system was originally developed for
the HP Windows/9000 and Starbase graphics environment.
Being an early proprietary system, FA/FM could not take
advantage of any of the work done by public domain
systems such as X. New development for FA/FM, such as the
creation of new fonts, had to be done by HP.

During the Starbase/X11 Merge design phase, the design
team saw the opportunity to remove FA/FM's reliance on
proprietary fonts and share the font files associated with
the X Window System. The team set about designing a new
font loading system that could be shared by both the X
server and the FA/FM libraries. In addition, the FA/FM
system was reengineered to render with X fonts. There
were good reasons to design the new system. By removing
FA/FM's reliance on proprietary fonts and allowing FA/FM
to use the same font files as X, we anticipated that FA/FM
would have a richer set of fonts to draw from. Whenever
a new font was distributed for X, it could be used by FA/FM
as well. X fonts are distributed in a format called Bitmap
Distribution Format, or BDF. This has become a de facto
standard in the workstation marketplace. Font vendors
typically make their fonts available in BDF format. BDF fonts
are usually translated by workstation vendors into Server
Natural Format (SNF) for efficiency in storage and loading.

We also saw the opportunity to conserve system
resources. While the X server is running, offscreen memory
and system RAM are heavily used. Therefore, it was
decided that with proper design and engineering, we could
create a system that allowed both FA/FM and X not only
to share font files, but also to share the actual fonts in
virtual memory and offscreen memory.

The core of the font sharing system is the font loader.
Early in the design phase it was decided that the easiest
way to share fonts was to write a single font loading system
that could be used by the X server and the FA/FM library.
This shared loader's responsibility is to read the font file
from disk into shared memory, making the font available
to requesting processes.

At the most basic level the font loader is quite simple.
When a request is made to load a font, the font loader does
the following:

   Locate the font file and verify that it is a valid X font
   If font is in shared memory
      Establish pointers
      Return
   Else
      Allocate the necessary virtual memory
      Create a shared memory object in the GRM's shared memory space
      Load the font's disk image into the shared memory
      Establish pointers
      Return

The GRM shared memory object created by the font
loader is the data structure labeled "Font Object" in Fig.
4. As long as a particular font remains loaded, any further
requests to load this font will result in the loader's finding
the font in shared memory because the same code is used
by Starbase applications and the X server to load fonts.
This ensures that at no time will there be more than one
copy of a font in memory.
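
The load-once behavior can be sketched as follows. This is a minimal model, not the actual loader: a dictionary stands in for the GRM shared memory space, the font path is a made-up example, and validation of the font file is left out.

```python
shared_memory = {}    # stand-in for the shared memory space: path -> font image
disk_reads = []       # records each time a font is really read from disk


def load_font(path, read_file):
    """Return the shared copy of a font, reading it from disk only once."""
    if path in shared_memory:
        return shared_memory[path]           # already loaded: establish pointers
    disk_reads.append(path)
    shared_memory[path] = read_file(path)    # create the shared object and fill it
    return shared_memory[path]


fake_disk = {"fonts/example.snf": b"glyph data"}   # hypothetical font file
x_server_font = load_font("fonts/example.snf", fake_disk.__getitem__)
fa_fm_font = load_font("fonts/example.snf", fake_disk.__getitem__)
```

Both callers end up holding the same object, which is the property the shared loader is meant to guarantee.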

There were some additional requirements that had to be 
met for this new technology to be acceptable. 

■ Object Code Compatibility. Even though the font files
used by FA/FM were changing, we had to ensure that
programs that used the old technology would still work.

■ Relinked applications had to work. We had to ensure
that relinking an application to use the new FA/FM font
technology did not cause it to break, even though the
application might contain absolute pathnames of fonts.

The first requirement meant that whatever was done for
the reengineered FA/FM system, the fonts that were
currently used by the old FA/FM system must remain where
they were in the file system so that old object code
references would still function.

The second requirement could have been met easily if
not for the first requirement. For example, if an object
module that contained a request to load the font
/usr/lib/raster/6x8/lp.8U was relinked, it had to be able to
find a font that was in the X font format from this pathname,
even though the exact file named contained the original
FA/FM font file. To get around this problem and to satisfy
the first requirement, the directory structure used for FA/FM
fonts was modified. It was decided that any directory that had
old-style FA/FM font files in it would have a subdirectory
named SNF. This SNF directory would contain analogs to
the FA/FM font files, but in the X font format. Fig. 8 shows
the old and new directory formats.

With this scheme, all of the old-format FA/FM font files 
can remain untouched, and the modified directory struc- 
ture satisfies the first requirement. 

To meet the second requirement, the method used by
the font loader to find a font had to be expanded to
accommodate the new directory structure. Instead of just
accepting the pathname given it, it had to be able to search
a little further. Thus, the first step of the loader process was
expanded to:

   Look at the name given
   If valid font file, load it
   Else
      Insert "/SNF" into path
      Look for valid font in this path
      If found, load it
      Else error
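
The expanded search can be sketched as below. The validity check is passed in as a function because the real test for a valid X font file is not described here, and the assumption that the X-format analog keeps the same file name inside the SNF subdirectory is ours.

```python
import os


def find_font(path, is_valid_x_font):
    """Return the path of a usable X-format font, or raise if none is found."""
    if is_valid_x_font(path):
        return path
    directory, name = os.path.split(path)
    candidate = os.path.join(directory, "SNF", name)   # insert "SNF" into the path
    if is_valid_x_font(candidate):
        return candidate
    raise FileNotFoundError("no X-format font found for " + path)


# Hypothetical layout: only the SNF analog is in the X font format.
old_style = os.path.join("fonts", "6x8", "example")
snf_analog = os.path.join("fonts", "6x8", "SNF", "example")
valid = {snf_analog}
found = find_font(old_style, valid.__contains__)
```

An old pathname compiled into an application therefore keeps working after relinking, while the old-format file stays where it was.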

With the new font loader, fonts need to be loaded into
memory only once no matter how many applications are
using them. Backwards compatibility with the old FA/FM
system is preserved. The X server and the FA/FM system
now share the source code to accomplish font loading, thus
ensuring compatibility and reducing maintenance
requirements.

Sharing the Color Map 

One of the recurring themes of the Starbase/X11 Merge
project was how to make X and Starbase share resources
that they previously believed they each controlled
exclusively. Among the resources that had to be arbitrated
were the color map and display controls like display enables
and blink control.

Fig. 8. The old (a) and new (b) directory structures used to
find and load FA/FM fonts.

Notions of Color Map 

The concept of a color map was modified quite a bit
from Version 10 to Version 11 of the X Window System.
In Version 10 there was a single color map that every client
allocated colors from. When all of the colors were used,
the client simply made do with what it had or exited. In
Version 11, the concept of a virtual color map was designed.
Multiple color maps can be created regardless of how many
color maps the hardware can support. In fact, every window
can have a different color map. The color maps get installed
by a window manager according to some policy usually
established by the user. This way, every window can use
the entire range of colors that a particular display has to
offer. Fig. 9 illustrates the concept of virtual color maps.

Starbase, on the other hand, maintains the single color
map notion. Starbase is designed to believe that it is always
running to a raw display device and that it has complete
control over the device. It also assumes that there is a single
hardware color map and that it writes directly to it.

The Needs

The solution for the X server sharing the color map with
a Starbase application was simple. Every time a Starbase
application opens a window and requests that the color
map be initialized, a new X color map is created for that
window. In this way, Starbase applications that believe
that they have complete control over the color map can
run without modification. This solution easily takes care
of the problem of how to emulate a single hardware color
map with exclusive access for a Starbase application.

This solution does not answer the question of how
Starbase applications can read from and write to the color map
or how Starbase can share the color map with other X
applications. The first option explored was to have Starbase
use the standard X color map calls. There were a number
of problems with this option. Starbase has a different notion
than X does of how some color maps look. For example,
in X it is possible to write only to the red bank of a particular
color map entry. This is not true of Starbase. For example,
for 24-bit displays, Starbase looks at the color map as a
single array of RGB values that can only be written
as tuples. X views the same color map as three separate
banks of color maps, representing the red, green, and blue
banks of entries.

There was also the question of performance. Starbase
applications use rapid alterations of the color map to
achieve certain visual effects. Using the X color map
mechanism with its overhead of X server communication might
prove to be a bottleneck for performance.

Finally, Starbase allows the manipulation of more than
just the values of the color map. The shared memory version
of the X color map includes additional attributes such as
the display enable, color blinking, and color map blending.
Also, information about transparent colors is included in
overlay plane color maps. None of this information can be
manipulated using the standard X color facilities.

The Solution 

The design team agreed that Starbase's needs were
beyond the capabilities provided by X and a new approach
was needed. The approach finally agreed on was for each
X color map to have an analog that the X and Starbase
display drivers dealt with called a display state (see Fig.
10). These display states are created in shared memory
every time a new X color map is created, and they can be
manipulated by X or directly by Starbase clients. As
information is written to the display state by a display driver,
the display state is checked to see if it is installed in the
hardware. If it is, then the hardware values and the display
state are changed. If not, then only the software values in
the display state are changed.
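
The write rule for a display state can be modeled briefly. The class and function names below are illustrative only, and the color table is reduced to a list of RGB tuples; the real display state also carries attributes such as display enables and blink control.

```python
class DisplayState:
    """Software copy of a color map kept in shared memory (illustrative only)."""

    def __init__(self, size):
        self.colors = [(0, 0, 0)] * size


class Hardware:
    def __init__(self, size):
        self.colors = [(0, 0, 0)] * size
        self.installed = None              # display state currently loaded


def write_color(hardware, state, index, rgb):
    state.colors[index] = rgb              # the software values always change
    if hardware.installed is state:
        hardware.colors[index] = rgb       # hardware follows only when installed


hw = Hardware(16)
state_x, state_sb = DisplayState(16), DisplayState(16)
hw.installed = state_x
write_color(hw, state_x, 1, (255, 0, 0))    # installed: reaches the hardware
write_color(hw, state_sb, 1, (0, 0, 255))   # not installed: software copy only
```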

Since the display state is in the shared memory and is
managed by the graphics resource manager, Starbase
applications can now manipulate it. Now when a Starbase
application opens a window and requests initialization, the
driver performs the following operations:
■ Create an X color map (this operation creates a display
state in shared memory).

■ Associate the color map with the X window.

■ Establish pointers to the shared memory display state
data structure.

■ Initialize the display state.

Fig. 9. Virtual color maps. The virtual color maps contain
images of what the hardware color map should look like. The
window manager controls which color map is loaded into
the hardware. The one hardware color map and its associated
registers contain two types of information: display plane
information and the color table.

Fig. 10. The architecture for sharing the color map in
Starbase/X11 Merge. (Diagram labels: Display, X Server,
Starbase Application, Display State in Shared Memory.)

Whenever a Starbase application makes changes to the
display state, the Starbase driver does so directly, not using
the X color map routines. In this way, it is not slowed by
the overhead of the X server communication mechanism.
And since the Starbase driver creates its own color map,
it assumes that it can do anything with it that could
normally be done with the hardware color map.

When a window is opened without the INIT* flag, Starbase
asks the X server which color map is associated with the
window. It then connects to the display state of that color
map in shared memory. In this mode Starbase respects the
restrictions placed on the color maps by the X server
protocol. For example, Starbase will not change a color map
cell if the X server has marked it as read-only. The X server
also does not allow a Starbase program to change the
display enable register. This allows a Starbase application to
continue to use Starbase library calls to modify the color
map, but still cooperate fully with other X clients.

Only one problem was left to resolve: how to communicate
to the X server the changes made by Starbase applications
to the shared display state data structure. An X server
extension called XHPSynchronizeColorRange was created to
solve this problem. When a Starbase display driver alters
the values in a display state, it then calls this extension
routine. The X server then reads the current values of the
display state and updates its notion of the color map's
contents.



*INIT is a standard flag used with gopen that implies clearing of the opened display
and the initialization of the color map to Starbase default values.



Backing Store 

The backing store is a piece of memory where the
contents of a window are backed up in case the window gets
destroyed or obscured by some user action, such as
relocation or resizing, or by the action of another program. The
X server supports backing store on a per-window basis. If
an X client requests the server to maintain a backing store
for a window, the server will do so if possible.

Fig. 11 illustrates the use of backing store in the standard
X environment. The contents of a window and its backing
store are shown in different frames. Assume initially that
window A is completely visible and has a picture of an
arrow on the screen. At this stage its backing store is empty
(frame 1). When window B is placed on top of window A,
window A is obscured and the picture in the obscured
region is damaged. If window A was created with a backing
store, the server will intervene before the damage takes
place. When the server realizes that the surface of window
A is going to be encroached upon by some other window,
the server saves the picture from window A to its backing
store (frame 2). When window B is removed, the picture
in the unobscured region of window A has to be updated
(frame 3). If window A has a backing store, the server copies
the appropriate region from the backing store and recreates
the picture in window A (frame 4).

If window A has no backing store, then the only way of
updating the picture would be to send an expose event
notice to the client owning window A. The expose event
tells the client that a region or regions of its window have
become exposed, that is, the picture contained in that
region may have become inconsistent. If the client chooses
to, it can update the picture by sending appropriate
rendering instructions to the server. In many graphics
applications, it ends up redrawing all objects in the window even
though only a small part of the window may have been
damaged.

Fig. 11. Views of a window and its backing store. In frame
1 window A is completely visible and backing store is empty.
In frame 2 window A is partially obscured by window B. In
frame 3 window A is unobscured and part of the screen
picture is missing, and in frame 4 the missing part of the
picture is copied from backing store to window A without
intervention by the client.

For complex applications, redrawing the entire window
is a time-consuming event. The standard X server does
not guarantee that all implementations will support
backing store. The burden of redrawing a window is left to the
X clients. All X applications must be knowledgeable about
expose events and must be able to deal with them.

The Starbase/X11 server, however, must provide backing
store support and cannot depend on the clients' ability to
deal with expose events, since the Starbase libraries and
most Starbase applications have no notion of expose events
and do not know how to handle them. A Starbase
application running in an X window would be unable to refresh
a window after the window became unobscured, so the X
server must update it from the backing store. The X server
not only must support backing store, it must also make the
backing store a sharable object between its own display
drivers and the Starbase display drivers. Therefore, the
Starbase/X11 X server employs "smart" rendering
functions to share the backing store between the X and Starbase
applications.

HP support of the backing store capability in the X server
dates back to the days of Version 10 of the X Window
System. In the HP implementation of the X10 server, this
capability was called the retained raster facility.

The Starbase/X11 version of the X server operation of
backing store was guided by two considerations:
■ Operations involving backing store should be as fast as
possible.

■ In a window with backing store a pixel must never be
rendered twice. If the pixel is in the visible portion of
the window, it must be rendered on the screen; otherwise
it will be rendered in the backing store.

Allocation Policy 

The backing store of a window is always of the same size
as the window it is backing up. The X server always tries
to accommodate the backing store in the offscreen frame
buffer. With the assistance of the display hardware,
operations on backing store resident in the offscreen frame
buffer are as fast as those on the screen. However, the frame
buffer is a limited resource, and there will be occasions
when there will not be enough space in the frame buffer
for a backing store operation. When this happens, the X
server will place the backing store in virtual memory.

Direct hardware access windows (DHA windows) are
shared between the X server and the Starbase application.
If at any time in its life a DHA window is declared to be
a backing store window, the X server will ask the graphics
resource manager for a portion of offscreen memory large
enough to fit the window. If none exists, the X server will
ask the GRM for a portion of shared memory so that both
the X server and the Starbase application can render to the
shared backing store. However, like the frame buffer, shared
memory is also a limited resource. Thus, there is no
guarantee that sufficient space will be available in the shared
memory at the time the allocation request is made to the
GRM. If the GRM cannot provide the needed amount of
shared memory, the server will declare the DHA window
to have no backing store.
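
The allocation policy for ordinary and DHA windows amounts to a short fallback chain, sketched below. The function, its parameters, and the idea of comparing simple byte counts are simplifications introduced here; the real server asks the GRM for regions rather than comparing sizes.

```python
def place_backing_store(is_dha, offscreen_free, shared_free, needed):
    """Return where a window's backing store would go, or None if it gets none."""
    if needed <= offscreen_free:
        return "offscreen frame buffer"     # always tried first
    if not is_dha:
        return "virtual memory"             # only the X server renders to it
    if needed <= shared_free:
        return "GRM shared memory"          # both processes must reach it
    return None                             # DHA window loses its backing store


ordinary = place_backing_store(False, offscreen_free=0, shared_free=0, needed=4096)
dha_ok = place_backing_store(True, offscreen_free=0, shared_free=8192, needed=4096)
dha_none = place_backing_store(True, offscreen_free=0, shared_free=1024, needed=4096)
```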

MOMA windows are never provided with backing store.
MOMA windows employ transform engines in the
hardware to accelerate their rendering performance. There
is no way to take advantage of the hardware transform
engines to render to the backing store if the latter is in
virtual memory. Since we cannot guarantee that the backing
store will be in offscreen memory, the X server does not
support backing store for MOMA windows. Therefore, if a
window with backing store becomes a MOMA window,
the X server will dispose of its backing store.

Smart Driver Functions 

The X server employs smart driver functions to render
to its drawables. A drawable is a two-dimensional window
or pixmap that X and Starbase can draw on and treat as
a single unit. These driver functions are called smart
because they can distinguish between different types of
drawables, such as windows without backing store, windows
with backing store in frame buffer, windows with backing
store in virtual memory, and pixmaps in virtual memory.

When a smart driver function is called to render to a
window, the function can determine whether the window
has a backing store. If the window has a backing store, the
function is able to determine the location of the backing store,
which can be in the frame buffer, virtual memory, or GRM
shared memory. Further, the driver can figure out which
parts of the backing store represent obscured regions of the
window. With this knowledge, the smart functions render
the necessary pixels either on the screen or in the backing
store. It is never necessary to render to a pixel twice.
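
The rule that a pixel is rendered exactly once, either on the screen or in the backing store, can be shown with a toy model. Pixels are (x, y) tuples and the visible region is a plain set; both are simplifications, since the real drivers work with clip lists of rectangles.

```python
def smart_fill(pixels, visible, screen, backing_store, value):
    """Write value to each pixel exactly once, on screen or in backing store."""
    for xy in pixels:
        if xy in visible:
            screen[xy] = value            # visible part of the window
        else:
            backing_store[xy] = value     # obscured part of the window


window = {(x, 0) for x in range(4)}
visible = {(0, 0), (1, 0)}                # the rest is covered by another window
screen, backing_store = {}, {}
smart_fill(window, visible, screen, backing_store, 7)
```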

To make backing store sharable between X and a DHA
Starbase client, the X server HP extension
XHPRegisterWindow() is used to create the backing store object
shown in Fig. 4. The following information is contained in
this object:
■ Drawable Type (drawable_type). An integer flag
representing the backing store attributes of the window. The
values indicate whether the window has backing store and
whether it is located in the offscreen frame buffer
memory, virtual memory, or GRM shared memory.

■ Backing Store Stamp (bs_stamp). An integer counter that
is incremented whenever the state of the window's
backing store changes. This is a trigger to the client that it
needs to obtain new backing store information from the
shared memory object.

■ Shared Memory Offset (sm_offset). A pointer to the start
of backing store if it is located in shared memory. The
value of this pointer is an offset relative to the beginning
of the shared memory segment. The client must add its
own shared memory base address to determine the true
absolute address.

■ Shared Memory Stride (sm_stride). An integer value
representing the width of the shared memory backing store
pixmap in bytes.

■ Backing Store X Offset (bs_offset_x). An integer value
representing the frame buffer x offset of backing store if it
is in frame buffer offscreen memory.

■ Backing Store Y Offset (bs_offset_y). An integer value
representing the frame buffer y offset of backing store if it
is in frame buffer offscreen memory.

■ Backing Store Planes (bs_planes). An integer bit mask
representing the display bit planes that are managed by
backing store.

■ Backing Store Pixel (bs_pixel). An integer representing the
value to be placed in the bit planes not managed by
backing store.
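
A sketch of how a client might use this object follows. The field names follow the list above; the helper functions, the sample values, and the assumption of one byte per pixel are ours and are not taken from the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class BackingStoreObject:
    drawable_type: int
    bs_stamp: int
    sm_offset: int
    sm_stride: int
    bs_offset_x: int
    bs_offset_y: int
    bs_planes: int
    bs_pixel: int


def pixel_address(obj, client_base, x, y):
    # The client adds its own base address to the segment-relative offset.
    # Assumes one byte per pixel, so a row is sm_stride bytes long.
    return client_base + obj.sm_offset + y * obj.sm_stride + x


def needs_refresh(obj, cached_stamp):
    # A changed stamp tells the client to reread the shared object.
    return obj.bs_stamp != cached_stamp


obj = BackingStoreObject(2, 7, 4096, 1024, 0, 0, 0xFF, 0)   # sample values only
address = pixel_address(obj, client_base=0x100000, x=3, y=2)
```

Storing an offset rather than an absolute pointer matters because the X server and each client may attach the shared segment at different addresses.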

Deep Backing Store 

Starbase supports 24-plane-deep windows. Therefore, it
was necessary to develop a method for the X server to
support a 24-bit-per-pixel backing store. The main problem
was determining how deep backing stores can be organized.
In 24-plane-deep displays, the frame buffer is organized as
three memory banks, each eight planes deep. The three
banks are the red, green, and blue banks. As long as the
backing store is placed in the frame buffer, there is no
problem. The RGB components of each pixel are stored in
their corresponding bank. There is a problem, however,
when the backing store must be placed in virtual or shared
memory.

In the X server, rendering to the virtual memory is done
using the memory drivers leveraged from the Starbase
library. There are two main components of the memory
driver: the bit driver and the byte driver. The bit driver is
used to draw on one-bit-per-pixel virtual memory pixmaps,
and the byte driver is used for one-byte-per-pixel virtual
memory pixmaps. In implementing the deep backing store
we could have written a new memory driver for drawing
to 24-bit-per-pixel virtual memory pixmaps or organized
the deep backing store so that the existing memory drivers
could be used without any modification. We chose the
latter solution (see Fig. 12).

The organization of the deep virtual memory backing
store mirrors that of the deep frame buffer. The deep virtual
memory backing store is organized as three software banks,
each one byte deep, corresponding to RGB banks in the
hardware (see Fig. 12c). With this organization we are able
to use the byte drivers without any change. However, for
each drawing operation we call the byte driver three
times, once for each software bank. This organization also
simplifies the process of copying data from the virtual
memory backing store to the screen because the data from
a software bank is simply moved to the corresponding
hardware bank.
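
The three-bank layout can be sketched with a contiguous byte array. The function names are hypothetical, and a single-pixel write stands in for the byte driver's drawing operations; the point is only that an unmodified one-byte-per-pixel routine is called once per bank.

```python
def byte_driver_set(bank, width, x, y, value):
    # Stand-in for the existing one-byte-per-pixel memory driver.
    bank[y * width + x] = value


def deep_set(store, width, height, x, y, rgb):
    bank_size = width * height
    for bank_index, component in enumerate(rgb):    # red, green, blue banks
        start = bank_index * bank_size
        bank = memoryview(store)[start:start + bank_size]
        byte_driver_set(bank, width, x, y, component)


width, height = 4, 2
store = bytearray(3 * width * height)     # three software banks, back to back
deep_set(store, width, height, 1, 1, (10, 20, 30))
```

Copying a software bank to the screen is then a plain block move into the matching hardware bank, with no per-pixel unpacking.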



Fig. 12. (a) A 24-plane-deep window on the screen. Of
course the physical depth of display memory is not seen by
the user. (b) 24-plane-deep backing store in offscreen frame
buffer memory organized in three hardware banks (red, green,
and blue) of 8 planes each. The picture on the display is
replicated on the three banks. (c) 24-plane-deep backing store
in virtual memory. This is a contiguous piece of memory
organized in three compartments (software red, software green,
and software blue), each 8 planes deep. Each compartment is
a software bank mirroring the hardware banks.
Sharing Overlay and Image Planes
in the Starbase/X11 Merge System

Developing a method to take full advantage of the
capabilities of display memory was one of the challenges
of the Starbase/X11 Merge project.

by Steven P. Hiebert, John J. Lang, and Keith A. Marchington



DEPENDING ON THE DISPLAY DEVICE, the X server
allows users to configure a display in four
fundamental display modes: image mode, overlay mode,
stacked mode, and combined mode (see Fig. 1). The display
mode determines how the hardware display memory is
used. This article describes the rationale for the different
display modes and how each of them works. The combined
mode is discussed in greater detail than the others because
it is the most sophisticated mode and it is available on the
TurboSRX 3D graphics accelerator display system.

HP offers a wide variety of display hardware for its
workstation products. This display hardware ranges from
low-resolution monochrome displays to high-resolution
displays with 16 million colors and 3D acceleration hardware.
Using the full range of display capability in the display
hardware was one of the challenges for the Starbase/X11
Merge design team.

One of the underlying philosophies of the X Window
System is that it provides the tools to build different user
interfaces, but it does not enforce any particular user
interface standard. Thus X provides mechanisms, not policy.
To maintain this philosophy, it was decided that the X
server would provide the different display modes for the
X Window System and allow the user to choose the display
mode most appropriate for the application.

Overlay and Image Planes

All display systems for HP's workstations have at least
one and as many as 24 planes of display memory. In
addition, some of the more sophisticated display systems have
additional display memory called overlay planes. The
overlay planes are so named because they appear on top of, or
over, the image planes. For example, if the overlay planes
of a display are enabled and each pixel is set to black, then
the image planes would not be visible. Areas of the overlay
planes must be disabled or made transparent to view the
image planes. Overlay planes can be set to a transparent
color so that the image planes can be seen. Existing HP
displays have from zero to four overlay planes.
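
The way overlay planes sit over image planes can be illustrated per pixel. The sketch assumes that one overlay value is designated as the transparent color; the constant and function names are hypothetical and do not describe the actual hardware registers.

```python
TRANSPARENT = 0   # hypothetical overlay value chosen as the transparent color


def visible_pixel(overlay_value, image_value, overlays_enabled=True):
    """Return which plane set supplies the pixel the user sees."""
    if overlays_enabled and overlay_value != TRANSPARENT:
        return ("overlay", overlay_value)
    return ("image", image_value)


menu_pixel = visible_pixel(overlay_value=3, image_value=200)
picture_pixel = visible_pixel(overlay_value=TRANSPARENT, image_value=200)
disabled = visible_pixel(overlay_value=3, image_value=200, overlays_enabled=False)
```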

The image planes are used primarily for rendering
complex images and usually have more capabilities than
overlay planes. For example, on the TurboSRX display system,
the 3D accelerator can clip to an arbitrary set of rectangles
in the image planes, but not in the overlay planes. Overlay
planes have a number of uses, but primarily they are used
to display information like text and menus. In this way
rendering in the image planes is not damaged by menus
or text, and costly redraws of pictures in the image planes
are prevented. With some 3D graphics or complicated 2D
graphics, such redraws can take many minutes.

Fig. 1. An illustration of the different display modes.
(a) Image mode. All rendering by X is done only to the
image planes of the display (8 planes for each color).
(b) Overlay mode. All X rendering is done only to the
overlay planes (4 planes), and the image planes can be used
by other applications. To see what is on the image planes
the overlay planes would have to be made transparent.
(c) Stacked screens mode. The overlay and image planes are
treated as two separate screens. (d) Combined mode.
Implemented primarily to support the display capabilities of
the TurboSRX, the combined mode uses the overlay and image
planes as one screen.

The overlay and image planes are located in the frame
buffer and, as shown in Fig. 2, each plane is organized into
on-screen and offscreen memory.

Image Mode 

Every HP display system supports the image mode and
all but the TurboSRX will default to image mode if the user
does not specify a display mode. In the image mode, the
X server performs all rendering only on the image planes
available on the display device. If the display device has
any overlay planes they are set to transparent in this mode.
See Fig. 1a.

Overlay Mode 

The overlay mode is almost identical to the image mode,
except that the overlay planes of the display device are
used by X rendering calls, and the image planes are free
to be used by other applications such as Starbase graphics
applications. A good example of this configuration is the
HP 9000 Series 300 and 800 SRX (solids rendering
acceleration) display system. The 3D acceleration hardware of
the SRX is not capable of clipping to window boundaries, so
it is not useful to a window environment. For the 3D
acceleration hardware to be useful, it must have unobstructed
access to the full, unobscured image planes. To run with
this hardware configuration, a window-based application
can provide all user interface components (e.g., windows
and menus) in the overlay planes using the X display driver,
and use the 3D accelerator for more complex rendering in
the image planes. By creating a transparent window in the
overlay plane, or by setting the window system's root
window to transparent, the image planes can be made viewable.
On the SRX display, this is the only way to use the 3D
graphics accelerator and a window system such as X at the
same time.

Stacked Screens Mode 

In the stacked screens mode the overlay planes are used
as one screen and the image planes as another (see Fig.
1c). In this way, the window system has twice as much
screen "real estate." Stacked screens mode is literally the
image mode and overlay mode running simultaneously.
The screens are stacked one on top of the other with the
visible screen being the one where the mouse cursor is
located. To get from one screen to the other, the user simply
moves the mouse off the edge of the current screen. The
other screen is made visible as the mouse enters it. All of
the normal capabilities of X are available in both the image
and the overlay screens, and all of the restrictions of the
image and overlay modes apply.

Stacked screens mode is particularly popular with soft- 
ware developers because it is possible to make twice as 
much information easily viewable. This means that a de- 
veloper can have a debugger, terminal emulators, editors, 
code viewers and other applications all running at the same 
time and viewable. 



Combined Mode

Image, overlay, and stacked screens modes were available
in the X Window System before the Starbase/X11
Merge project. However, the Starbase/X11 Merge project's
goal was to provide full-performance Starbase graphics in
X windows wherever possible. Since the TurboSRX
display, which is the successor to the SRX display system,
has the hardware necessary to do accelerated graphics in
windows, this meant that we needed to provide accelerated
graphics in windows as well. This could have been done
in image mode on the TurboSRX, but it would not have
been as aesthetically pleasing.

The design team decided that a new approach was
needed for the TurboSRX. This new approach is called the
combined mode. The combined mode uses all of the planes
of the display system (both image and overlay) as a single
screen, making it look to the application as if there were
simply one contiguous set of planes with a variety of
different capabilities (see Fig. 1d). Using both the overlay
and the image planes as a single screen is basically the
opposite of how stacked mode works. In stacked mode the image
and overlay planes are treated as two separate screens.
With the combined mode the capabilities of the TurboSRX
and X can work together.

TurboSRX Capabilities 

Many of the capabilities provided by the HP 9000 Series
300 and 800 TurboSRX graphics subsystem are also provided
by its predecessor, the SRX. These capabilities include:

■ Image Planes. There can be 8 to 24 planes of image
memory plugged into the display system. The system can be
used as an eight-bit pseudocolor device (CMAP_NORMAL
mode) offering 256 colors simultaneously or as a 24-bit
color device (CMAP_FULL mode) offering over 16 million
colors simultaneously.

■ Overlay Planes. Each display system has three or four
planes of memory that overlay (or are in front of) any
other display memory. The original intention for these
planes was to use them for floating text, cursors, or
menus.

■ Double Buffering. The image planes can be partitioned
as pairs of banks in a variety of ways for double buffering.
The most common configurations are to divide them into
two eight-bit banks in CMAP_NORMAL mode and into two
12-bit banks in CMAP_FULL mode.

■ Color Map Mode Hardware. The color map mode
hardware enables the display system to run either in the
CMAP_NORMAL mode or the CMAP_FULL mode. If 24 planes
of image memory are plugged into the display system,
in CMAP_NORMAL mode each pixel is interpreted by taking
the eight-bit pixel value out of the low bank of display
memory and using it as an index into a table of RGB
(red, green, blue) values to determine what color a
particular pixel on the display should be. In CMAP_FULL
mode, each of the three eight-bit banks of display memory
is read to determine which red, green, and blue value
should be used on the display. By writing to a hardware
mode register, these modes can be dynamically switched
and different windows on the display screen can be
displayed in different color map modes.

■ 3D Graphics Hardware. Both systems have the ability to
render complex 3D graphics, providing realistic images
on the display. The front cover of this issue shows an
example of the realistic images that can be produced
using combined mode on a TurboSRX display system.
The 3D images (such as the gears) are located in the
image plane, and the other items on the display are
located in the overlay plane.

Capabilities available in the TurboSRX but not in the SRX
include:

■ Hardware Cursor. Two planes of memory (in addition
to the overlay and image planes) are available for the
display of cursors. This feature allows a hardware cursor
to be placed on the display without disturbing the
contents of any of the image or overlay planes beneath it.
The hardware cursor also offers the advantage of not
having to remove the cursor to render, since it resides
in its own plane of display memory. Not removing the
cursor before rendering provides better performance for
rendering routines.

■ MOMA Window Support. From the perspective of the
Starbase/X11 Merge system, this is probably the most
significant feature on TurboSRX. MOMA (multiple,
obscurable, movable, accelerated) window support allows
the TurboSRX accelerated graphics capabilities to
be used in a windowed environment by providing special
clipping hardware. This clipping hardware allows
the TurboSRX graphics accelerator to render only to the
exposed rectangles of a window. The TurboSRX hardware
has support for a maximum of 32 clipping rectangles
for MOMA windows, which is an adequate number
for most window systems, but a small number for the X
Window System.

With these TurboSRX features in mind, the design team
focused on designing the Starbase/X11 Merge system to
take full advantage of the hardware capabilities of the
TurboSRX. This resulted in the following design goals for the
combined display mode:

■ Provide support for MOMA windows that would allow
Starbase applications to use the 3D graphics accelerator
in X windows.

■ Support eight-bit and 24-bit color modes. Make eight-bit
pseudocolor and 24-bit color with double buffering
available to applications.

■ Maintain the visual aesthetics of the system. When
possible, minimize the damage that different hardware
modes and different color maps cause to the appearance
of the display when they are displayed simultaneously.

■ Provide a state-of-the-art X server implementation.
Reconcile the capabilities of the X Window System, Version
11 with the capabilities of TurboSRX.

The Architecture 

With X11, a number of new concepts were introduced
to increase the capabilities of X such that it could be run
on the entire range of today's display hardware as well as
any future display hardware that might be developed. The
concept in X11 that is most important to the combined
mode is called the "visual." The visual is the mechanism
X uses to describe the capabilities of a particular display's
hardware. The visual structure includes:

■ Class. The class describes how a color is mapped from
memory to the display. There are two major classes,
static and dynamic, and subclasses of each. The subclasses
include gray, mapped color, and decomposed color.
Static and dynamic classes are defined at X server startup
time. Static classes cannot be changed by application
programs, but dynamic classes are definable and changeable
in the application program. The gray subclass means
that all the colors in the color map are shades of gray.
For the mapped color subclass, one-byte pixel values
from the frame buffer are used to index into a color map
of RGB tuples which describe the color to be displayed
(see Fig. 3a). For the decomposed color subclass, a
three-byte pixel value is used to get the RGB value from the
color map. The first byte is used for red, the second byte
for green, and the last byte for blue (see Fig. 3b). The
mapped color subclass allows up to 256 colors and the
decomposed subclass allows up to 16 million colors.
Each entry in the color map table represents a color
intensity (shade). For instance, the value 10 might
represent dim RGB intensities and 220 would represent bright
RGB intensities. These red, green, and blue intensities
are mixed together to produce the displayed color.
Putting these attributes of color maps together (class and
subclass) allows the device to support up to six types of
color maps. Table I shows the X color map types.



Table I
X Color Map Types

Subclass            Static         Dynamic

Gray                StaticGray     GrayScale
Mapped Color        StaticColor    PseudoColor
Decomposed Color    TrueColor      DirectColor

■ Color map entries. The number of different color map
entries available for use by client applications.

■ Bits of RGB information. How many bits of resolution
are available to describe red, green, and blue color values.



Fig. 2. The organization of the image planes in the frame
buffer, showing on-screen (visible) memory and offscreen
memory. (a) A display system containing both image and
overlay planes (e.g., the HP 98550A Color Graphics Board).
(b) A display system with only image planes in the frame
buffer (e.g., the HP 9854?A).

Fig. 3. (a) Mapped color subclass. A one-byte pixel value is
used to index into the color map to obtain the RGB tuple.
(b) Decomposed color subclass. A three-byte pixel value
contains an index for each primary color in the color map.

■ Planes. The number of planes of display memory
available on the display device.
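The mapped color and decomposed color subclasses described
under Class above can be sketched as two lookup functions.
This is a minimal illustration, not the server's code: the
function names are hypothetical, and treating the "first
byte" of a three-byte pixel as its most significant byte is
an assumption.

```python
def mapped_color(pixel, color_map):
    """Mapped color: a one-byte pixel indexes a table of RGB tuples."""
    return color_map[pixel & 0xFF]


def decomposed_color(pixel, red_map, green_map, blue_map):
    """Decomposed color: each byte of a three-byte pixel indexes one primary."""
    r = (pixel >> 16) & 0xFF  # first byte: red index (assumed high byte)
    g = (pixel >> 8) & 0xFF   # second byte: green index
    b = pixel & 0xFF          # last byte: blue index
    return (red_map[r], green_map[g], blue_map[b])


gray_ramp = [(i, i, i) for i in range(256)]  # 256-entry table of RGB tuples
identity = list(range(256))                  # intensity equals index

dim = mapped_color(10, gray_ramp)
mixed = decomposed_color(0x0A14DC, identity, identity, identity)
```

A one-byte index can select 256 colors, while three
independent one-byte indexes can select 256 x 256 x 256, or
about 16 million.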

X11 makes it possible to have more than one of these
visuals available on a given screen at the same time. With
multiple visuals, it is possible to create a mode that
incorporates the capabilities of both the image and the overlay
planes of the TurboSRX so that the full range of the
display's capabilities are available to applications. We decided
to treat the image and overlay planes as a single screen
with the overlay planes represented by one three-or-four-plane
PseudoColor visual type. The number of planes is
dependent on how the user sets up the device file for them.
The image planes, with their CMAP_NORMAL and CMAP_FULL
modes, are allowed to have either an eight-bit PseudoColor
visual type, a 24-bit DirectColor visual type, or both
simultaneously. Another option allows an eight-bit
double-buffered PseudoColor visual type for image planes and a
12-bit double-buffered DirectColor visual type for image planes.

In combined mode, the root window for the screen always
resides in the overlay planes, and the overlay plane
visual is the default visual for the screen. Any client that
simply asks for a window to be created with the default
visual of the screen ends up residing in the overlay planes.
For an application to create a window in the image planes,
it has to request the visual information from the server and
specifically request the desired visual type.

The color map modes CMAP_NORMAL and CMAP_FULL for
the image planes are handled through virtual color maps.
Virtual color maps are an image of what the window or
client thinks the hardware color map looks like. As
described in the article on shared display resources on
page 20, each color map in the Starbase/X11 Merge system
has an analog called a display state, which is used by the
display drivers. Each display state contains the current
color values for a device's color map, some device-specific
information about which planes of the display are enabled,
and in the case of the TurboSRX, the color map mode of
the hardware. X provides a way for a program to control
which color map is currently loaded into the hardware
(this is called validating the color map). Usually a special
X client, such as a window manager, is the only program
that changes which color map is loaded (validated). The
window manager may have several methods for the user
to specify which color map is loaded. Therefore, when the
color map for an eight-bit PseudoColor window is installed
in the image planes, the hardware will be switched to
CMAP_NORMAL mode, and when the color map for a 24-bit
DirectColor window is installed, the hardware will be switched
to CMAP_FULL mode. Fig. 9 on page 29 illustrates the virtual
color map concept.
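The relationship between a virtual color map, its display
state, and the hardware color map mode can be sketched as
follows. The class and attribute names here are hypothetical
stand-ins chosen for illustration; they are not the names
used in the Starbase/X11 Merge drivers.

```python
CMAP_NORMAL, CMAP_FULL = "CMAP_NORMAL", "CMAP_FULL"


class DisplayState:
    """Hypothetical stand-in for the display state tied to one color map."""

    def __init__(self, visual_class, depth, colors):
        self.colors = colors
        # A 24-bit DirectColor map implies CMAP_FULL; otherwise CMAP_NORMAL.
        if (visual_class, depth) == ("DirectColor", 24):
            self.cmap_mode = CMAP_FULL
        else:
            self.cmap_mode = CMAP_NORMAL


class Hardware:
    """Hypothetical model of the color map hardware and its mode register."""

    def __init__(self):
        self.cmap_mode = CMAP_NORMAL
        self.loaded = None

    def install(self, state):
        """Load a virtual color map and switch the hardware mode to match."""
        self.cmap_mode = state.cmap_mode
        self.loaded = state.colors


hw = Hardware()
hw.install(DisplayState("DirectColor", 24, colors=[]))
mode_after_24bit = hw.cmap_mode
hw.install(DisplayState("PseudoColor", 8, colors=[]))
mode_after_8bit = hw.cmap_mode
```

Installing a different color map is the only event that
changes the mode in this sketch, which mirrors the role the
text gives to the window manager.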

The result of this approach is that most windows are
created in the overlay planes. Most X server clients such
as window managers and terminal emulators use the
default visual. Applications that request visual types that are
in the image planes can change the color map in the image
planes and use one of the color map modes without
affecting the visual appearance of the windows in the overlay
planes. Most of this color map control was provided for
Starbase applications because they usually assume that
they can change the color map at will. As a result a Starbase
application creates its own virtual memory color map for
a window that it opens.

This design allows the TurboSRX to be used in windows
and satisfies all of the design goals for the TurboSRX
display driver in the X server. With most of the windows in
the overlay planes, their clipping regions do not have to
be included in the hardware clip list for the accelerator.
This helps us live with the 32-clip-rectangle restriction of
the TurboSRX and achieve the full performance of a
Starbase application running in X.

Having most of the commonly used windows in the overlay
planes allows combined mode to maintain visual aesthetics
at the highest possible level, while still allowing
both eight-bit and 24-bit windows in the image planes. As
a counterexample, take the case of image mode. If image
mode were to allow both eight-bit and 24-bit windows
simultaneously, one of those two visual types would have
to be the default. If an application created a window of a
visual type other than the default and its display state were
installed, it would change the hardware color map mode
and all of the windows of the default visual type would
become incorrect in appearance. In fact, the windows,
including the root window, would become completely
indecipherable. However, with combined mode, when the
hardware color map changes, the windows in the overlay
plane (where most applications reside) remain visually
correct and only image plane windows become visually
incorrect.

Fig. 4. When the stacking order is changed, area X becomes
the newly exposed area in the image plane (I = window in
image plane, O = window in overlay plane). In the original
clipping algorithm, besides causing area X to be painted
transparent in the overlay plane, the image plane is
considered to be damaged, causing the image plane to be cleared
to the background color and an exposure event sent to the
client owning window I. The client would then rerender area
X in the image plane.

This design also provides a very straightforward view
for an X application. A client application can simply
connect to the server and request windows of the default type
and get windows in the overlay planes. Or, using the
XGetVisualInfo routine, the client application can interrogate the
server for all of its visuals or a particular visual it is
interested in. The application never worries whether it is in
overlay planes or image planes. The server automatically
places the window in the appropriate planes without
intervention by the application.
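The client's view of visuals can be sketched with a small
model. XGetVisualInfo is the real Xlib routine named in the
text; the Python names below (Visual, match_visuals,
create_window) and the visual list itself are hypothetical,
assuming a TurboSRX configured with four overlay planes and
24 image planes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visual:
    visual_class: str  # e.g. "PseudoColor" or "DirectColor"
    depth: int         # number of planes
    planes: str        # "overlay" or "image"; server-side detail


# Assumed visual list; the overlay visual is the screen default.
VISUALS = [
    Visual("PseudoColor", 4, "overlay"),
    Visual("PseudoColor", 8, "image"),
    Visual("DirectColor", 24, "image"),
]
DEFAULT_VISUAL = VISUALS[0]


def match_visuals(visual_class=None, depth=None):
    """Return visuals matching a template, in the spirit of XGetVisualInfo."""
    return [v for v in VISUALS
            if (visual_class is None or v.visual_class == visual_class)
            and (depth is None or v.depth == depth)]


def create_window(visual=None):
    """The server, not the client, decides which planes hold the window."""
    return (visual or DEFAULT_VISUAL).planes
```

A client that passes no visual lands in the overlay planes;
a client that asks for the 24-bit DirectColor visual lands in
the image planes without ever naming the planes itself.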

Implementation

The architecture described above fits very neatly into
the X model, and for the most part, the implementation
of combined mode was straightforward. But there were
some challenges in the implementation that resulted in
some interesting solutions. The two most challenging areas
were how to allow the user to see through the overlay plane
to the image plane windows and how to clip windows and
generate exposures for only those areas of windows that
were actually damaged by other windows. A window that
needs exposure is one that is covered up and needs to be
seen. To see a window that resides in the image planes,
the overlay planes must be made transparent. At first,
creating this transparent hole seemed like a difficult task, but
as it turned out, the X server architecture allowed this to
be handled quite easily. Whenever an area of a window is
exposed, the server is required to paint the window's
background. At this point, the X server determines if the
window being painted is in the image planes, and if it is, simply
makes the same area of the overlay planes transparent. In
this way all visible regions of the image plane window
have a corresponding area in the overlay planes painted a
transparent color.
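The background-painting step that opens a transparent hole in
the overlay planes can be sketched with two small pixel
arrays. The data layout, the TRANSPARENT value, and the
function name are assumptions made for illustration only.

```python
TRANSPARENT = 0  # assumed transparent overlay color


def paint_background(window, image, overlay, background):
    """Paint a window's exposed area; open the overlay for image windows."""
    x0, y0, x1, y1 = window["rect"]  # half-open rectangle
    for y in range(y0, y1):
        for x in range(x0, x1):
            if window["planes"] == "image":
                image[y][x] = background
                overlay[y][x] = TRANSPARENT  # let the image plane show
            else:
                overlay[y][x] = background


overlay = [[7] * 4 for _ in range(3)]  # overlay starts opaque everywhere
image = [[0] * 4 for _ in range(3)]
paint_background({"rect": (1, 0, 3, 2), "planes": "image"},
                 image, overlay, background=9)
```

After the call, only the overlay pixels above the image plane
window have been made transparent; the rest of the overlay is
untouched.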



Combined Mode Clipping 

To solve the problem of clipping windows and generating
exposures for damaged windows, and to make full use of
the capabilities of the TurboSRX hardware, the clipping
algorithm used in the X server had to be modified. In the
original X server, the clipping algorithm made no distinction
between overlay and image planes when computing
clip lists for windows. Lacking this distinction, a
window in the overlay planes would cause the server to
conclude that any windows in the image planes obscured
by the overlay plane window were damaged. When the
overlay plane window was moved or destroyed, the newly
exposed area of the image plane window would be cleared
to the window's background color and an exposure event
would be sent to the client owning the image plane
window. The exposure event tells the client that it must
rerender to the image plane (see Fig. 4). The modification of
the clipping algorithm allows windows in the overlay
planes to be created and destroyed without affecting
windows in the image planes.

For both clipping algorithms, new clip lists are computed
whenever an action is taken that could change the clip list
(e.g., changing the stacking order of the windows on the
screen). The function xosValidateTree() is used to compute
the new clip lists. xosValidateTree() adds the visible portions
of any children of the parent window to be reclipped
into the parent window's clip list and then, passing the
parent's clip list as the working universe, calls the routine
xosComputeClips() to let each of the parent window's children,
and the children's children and so on, recompute their clip
lists. The working universe includes the visible areas of
the parent window. Upon return from xosComputeClips(), the
working universe is the parent's new clip list. By subtracting
the old clip list from the new clip list the parent can
compute which areas have been newly exposed. That is,
any area in the new clip list that is not in the old clip list
must be newly exposed.
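The final subtraction step can be sketched with regions
modeled as sets of pixel coordinates. This is only an
illustration of "new clip list minus old clip list"; a real
server works on lists of rectangles, and the helper names
here are hypothetical.

```python
def rect(x0, y0, x1, y1):
    """A region as a set of pixel coordinates (half-open rectangle)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def newly_exposed(old_clip, new_clip):
    """Areas in the new clip list that were not in the old clip list."""
    return new_clip - old_clip


window = rect(0, 0, 4, 4)
sibling_on_top = rect(2, 0, 4, 4)
old_clip = window - sibling_on_top  # right half was obscured
new_clip = window                   # sibling has been moved away
exposed = newly_exposed(old_clip, new_clip)
```

Here the newly exposed area is exactly the part of the
window that the sibling used to cover.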

The modification of the clipping algorithm to support
combined mode consists mainly of computing two clip
lists for all the windows on the screen. One set of clip lists,
which we can call the old-style clip lists, is generated based
on the unmodified clipping algorithm described above (i.e.,
these clip lists contain windows from both the image and
the overlay planes). The second set of clip lists is computed
taking only the image plane windows into account
(image-only clip list). Within the X server, image plane windows
use the image-only clip list as the default clip list, and the
overlay planes use the old-style clip list as the default.
Both image and overlay plane windows use the old-style
clip list for cursor removal. Since either type of window
can have children or subwindows of the other type,
windows on both planes must keep the image-only and
old-style clip lists available.

In the new combined mode algorithm, rendering to the
image plane is done only when there are changes to the
windows in that plane and not because of changes to
windows in the overlay plane. The image-only clip list is used
to handle rendering to image plane windows. The old-style
clip list is used to determine which areas of the overlay
plane windows must be painted transparent to expose
windows in the image plane.

Combined mode clipping allows rendering to an image
plane window while it is obscured by an overlay plane
window. Since the root window is always in the overlay
planes, rendering can even take place to an image plane
window that is iconified. The server must take care,
however, to avoid rendering to areas of iconified image plane
windows used by other image plane windows. Image
windows that are not iconified are automatically removed from
the allowable rendering area by the old clipping method.
Extra programming was required to remove iconified image
plane windows from the allowable rendering areas of other
iconified image plane windows. That is, if two iconified
image plane windows overlap, neither may render to the
overlapping area. When one or the other of the iconified
windows is mapped, it will get an exposure event for that
overlapping area.
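The difference between the old-style and image-only clip
lists can be sketched for a flat list of sibling windows.
This simplified model ignores the window hierarchy and
iconified windows, and every name in it is hypothetical; it
is meant only to show why an overlay window does not shrink
the image-only clip list of an image plane window beneath it.

```python
def rect(x0, y0, x1, y1):
    """A region as a set of pixel coordinates (half-open rectangle)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def clip_lists(windows):
    """windows: (name, planes, region) tuples in front-to-back order."""
    old_style, image_only = {}, {}
    covered_all, covered_image = set(), set()
    for name, planes, region in windows:
        old_style[name] = region - covered_all     # every window obscures
        image_only[name] = region - covered_image  # only image windows do
        covered_all |= region
        if planes == "image":
            covered_image |= region
    return old_style, image_only


windows = [
    ("menu", "overlay", rect(0, 0, 2, 2)),  # on top, in the overlay planes
    ("model", "image", rect(0, 0, 4, 4)),   # beneath, in the image planes
]
old_style, image_only = clip_lists(windows)
```

The image-only list lets rendering to "model" proceed under
the menu, while the old-style list still records which part
of the overlay must be painted transparent to show it.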

Conclusion 

Combined mode is a solution to the complex problem
of how to support a high-end display system in the best
possible way. Combined mode offers some capabilities that
allow the TurboSRX display system to work at its full
potential in an X environment. With the addition of combined
mode, the X server now offers four different display modes
of operation to take full advantage of the broad range of
display hardware for HP workstations.



Sharing Input Devices in the Starbase/X11
Merge System

To provide support for the full set of HP input devices and
to provide access to these devices for Starbase
applications running in the X environment, extensions were
added to the X core input devices: the keyboard and the
pointer.

by Ian A. Elliott and George M. Sachs 



STANDARD X SERVERS SUPPORT two input devices:
the pointer (mouse, tablet, light pen, etc.) and
the keyboard. These devices are known as the core
input devices. The X server sends information from the
input devices to client programs in packets called "events."
The keyboard generates key events, while the pointer
generates button or motion events. These events contain
information that includes the absolute location in two
dimensions where the event occurred, the location relative to the
X window in which the event occurred, and a timestamp.
For key and button events, there is also a field that tells
which key or button was pressed.

In a typical X environment, multiple application programs
called clients run simultaneously. Each has its own
window or set of windows and all share the core input
devices. The X server arbitrates which client gets a particular
input event by determining which window has the
"input focus." The focus window, which is the window
that is allowed to receive input from input devices, is
normally either the smallest window that contains the pointer,
or is an arbitrary window explicitly established as the focus
window by a protocol request made by a client program.

We faced two major problems in the area of input device
support for Starbase/X11 Merge: how to provide the ability
to use the full set of Hewlett-Packard input devices in an
X environment, and how to access those devices through
Starbase in that environment. The first problem arose
because there is currently no X standard for using other input
devices in addition to the core devices. If additional devices
were supported, there is no provision within the defined
core events for determining which device generated the
event. There is also no provision in the existing events for
reporting data of more than two dimensions, or motion
data whose resolution is different from that of the screen.
The problem with Starbase was that prior to this project,
Starbase did not provide a way for multiple programs to
share input devices. The only input devices that could be
shared were those for which a window system arbitrated
the sharing and allowed Starbase input. These devices
included the HP Windows/9000 locator and the X Version
10 pointer and keyboard.
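The focus rule described above (the smallest window that
contains the pointer, unless a client has explicitly set a
focus window) can be sketched as follows. The window table
and function names are hypothetical, and windows are reduced
to plain rectangles for the purpose of illustration.

```python
def area(r):
    """Area of a half-open rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = r
    return (x1 - x0) * (y1 - y0)


def contains(r, x, y):
    """True if the point lies inside the half-open rectangle."""
    x0, y0, x1, y1 = r
    return x0 <= x < x1 and y0 <= y < y1


def focus_window(windows, pointer, explicit_focus=None):
    """Return the name of the window that receives input."""
    if explicit_focus is not None:
        return explicit_focus  # set earlier by a client's protocol request
    x, y = pointer
    hits = [(area(r), name) for name, r in windows.items()
            if contains(r, x, y)]
    return min(hits)[1] if hits else None


windows = {
    "root": (0, 0, 100, 100),
    "term": (10, 10, 50, 50),
    "button": (20, 20, 30, 30),
}
```

With the pointer at (25, 25) the smallest enclosing window,
"button", receives input unless a client has named another
focus window.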

To overcome these problems the goals established to
provide sharing of input devices in the Starbase/X11 Merge
system included:

■ Support a wider range of input devices including the
core devices, and ensure that all the devices supported
have the same functionality as that provided by the core
devices.

■ Support all input devices that follow the HP-HIL
(Hewlett-Packard Human Interface Link) specification¹ and
are supported by the HP-UX operating system.

■ Allow the choice of the core devices to be easily
configured and provide reasonable defaults if no choice is
made.

For Starbase applications the following additional goals
were established:

■ Provide full functionality for Starbase applications using
input devices in an X window.

■ Ensure that the design does not require source code
changes in the Starbase application, with the possible
exception of the call to the gopen function which
is used to open an input device.

■ Allow multiple programs to access and share the same
input devices simultaneously.

X Input Protocol and X Input Extensions

The core protocol of the X Window System defines a standard
syntax for the sequence of bytes that make up the protocol
requests. For example, the XSetInputFocus request allows a
client program to choose which window should receive input
from the keyboard, and has the following format:

Length    Value                    Meaning
(bytes)

1         42                       XSetInputFocus Request ID
1         0, 1, or 2               Revert-to-Window Parameter
2         3                        Request Length (in four-byte
                                   words)
4         0, 1, or a Window ID     Focus-Window Parameter
4         Timestamp Information    Focus-Time Parameter

The information in a protocol request like the one above tells
what request is being made (XSetInputFocus), the length of the
request (three four-byte words, or 12 bytes), and the values of
any parameters the request has. The parameters in the request
specify which window should receive input from the keyboard
(the Focus-Window parameter), which window should receive input
if the focus window disappears (Revert-to-Window parameter), and
when the XSetInputFocus request should take effect (Focus-Time
parameter). The 0, 1, and 2 values in the parameters are special
constants that indicate no window, whichever window contains
the X pointer, and whichever window was named as the parent
of the focus window, respectively.

X was designed to allow vendors such as Hewlett-Packard to
extend the protocol by defining new requests that can be
interpreted by X servers in addition to the core requests.
For example, the HP input extension defines a protocol request
named XHPSetDeviceFocus. This request allows a client program
to choose which window should receive input from devices other
than the keyboard or mouse. The request has the following
format:

Length    Value                    Meaning
(bytes)

1         128 ≤ Number ≤ 255       ID of HP Input Extension
1         8                        XHPSetDeviceFocus Request ID
2         5                        Request Length (in four-byte
                                   words)
4         0, 1, or a Window ID     Focus-Window Parameter
4         Device Identifier        Focus-Device Parameter
4         Timestamp Information    Focus-Time Parameter
1         0, 1, or 2               Revert-to-Window Parameter
3                                  Unused Bytes

The request begins with a number that identifies the extension
that implements the request and distinguishes the request from
core protocol requests. The next byte identifies the request
within the extension. The length, Focus-Window, Focus-Time, and
Revert-to-Window parameters serve the same purpose as they do
for the XSetInputFocus request described above. The Focus-Device
parameter identifies the input device for which the client
program making the request wishes to control the destination of
the input.
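The two request layouts described in the sidebar on the X
input protocol can be sketched as byte-packing functions.
This is an illustration under stated assumptions, not HP's
implementation: the function names are hypothetical,
little-endian byte order is assumed, 42 is the X11 core opcode
for SetInputFocus, and the extension ID (130) and device
identifier (4) used below are made-up example values.

```python
import struct


def set_input_focus(revert_to, focus, time, opcode=42):
    """Pack the 12-byte core request: ID, revert-to, length, focus, time."""
    return struct.pack("<BBHII", opcode, revert_to, 3, focus, time)


def hp_set_device_focus(ext_id, focus, device, time, revert_to):
    """Pack the 20-byte extension request, ending with 3 unused bytes."""
    return struct.pack("<BBHIIIB3x", ext_id, 8, 5,
                       focus, device, time, revert_to)


# Focus on window 0x200001, revert to the pointer's window (1), time 0.
core = set_input_focus(revert_to=1, focus=0x200001, time=0)

# Focus device 4 on the pointer's window (1), revert to parent (2).
ext = hp_set_device_focus(ext_id=130, focus=1, device=4,
                          time=0, revert_to=2)
```

The length fields count four-byte words, so the core request
occupies 3 x 4 = 12 bytes and the extension request 5 x 4 = 20
bytes.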

HP-HIL Input Devices 

HP-HIL input devices are grouped into three general
categories by the Starbase/X11 server. First, there are
keyboards and keyboard-like devices such as all of the
different HP language keyboards, the HP 92916A Bar Code
Reader, and the HP 46086A 32-Button Box programmable
function keys. These devices either generate keycode data,
or as in the case of the bar code reader, generate US ASCII
data which can be translated to keycodes. The second group
of input devices are those that generate absolute positional
data as well as button information. These include graphics
tablets and touchscreens. The existing devices of this type
report absolute positions for two axes, and may report zero,
one, three, or four buttons. The third group of input devices
are those that generate relative motion data. These include
two-button and three-button HP-HIL mice such as the HP
46095A 3-Button (quadrature) Mouse, the M1309A
Trackball, the HP 46085A Control Dial Module (nine-knob
box), and the HP 46083A Knob (one-knob box). The existing
devices of this type may report two or three axes of motion
and report zero, two, or three buttons.

There are a few HP-HIL devices that are not easily categorized. For
example, the HP 46084A ID Module, which is used to prevent
unauthorized software duplication, does not generate any input, but
occupies a position on the HP-HIL. It currently cannot be accessed
through the X server. A client program can access it directly, but not
across a network. Audio extension modules, such as the HP 46082A, do
not occupy a position on the HP-HIL, but X functions exist to access
the beeper contained in the module.

Core Input Devices 

Up to seven input devices can be attached to one HP-HIL. There is no
standard definition in X for determining which of those devices should
be used as the pointer or the keyboard. In Starbase/X11 Merge,
explicit specification of the core devices is done through a
configuration file. The name of the configuration file is constructed
using the display number specified by the user when X is invoked.
Because that number is under the control of the user, multiple
configuration files with different names can be used to specify
different input devices as the core devices. When a device is chosen,
it can be specified either by giving the name of its device file and
its intended use, or by giving an ordinal position (first, second,
etc.) and the type of device, along with its intended use. The
position of the device is relative to other input devices of the same
type on the HP-HIL, with the first device being the one closest to the
computer. For example, a graphics tablet can be specified as the
pointer device with a line in the configuration file of the form
/dev/hil2 pointer or with a line of the form first tablet pointer.
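
The two line forms described above can be told apart by their field
count. The following Python sketch is only an illustration of that
idea, not the server's actual parser: the CoreDeviceSpec type, the
parse_line function, and the set of ordinal words are all
hypothetical, and the device path used in the test is simply an
example name.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ordinal words; up to seven devices fit on one HP-HIL.
ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4,
            "fifth": 5, "sixth": 6, "seventh": 7}


@dataclass
class CoreDeviceSpec:
    use: str                       # intended use, e.g. "pointer" or "keyboard"
    path: Optional[str] = None     # set when the line names a device file
    ordinal: Optional[int] = None  # set when the line gives a position
    kind: Optional[str] = None     # device type, e.g. "tablet"


def parse_line(line: str) -> Optional[CoreDeviceSpec]:
    """Parse one configuration line; return None for a blank line."""
    fields = line.split()
    if not fields:
        return None
    if len(fields) == 2:
        # Form 1: <device file> <use>
        return CoreDeviceSpec(use=fields[1], path=fields[0])
    if len(fields) == 3 and fields[0].lower() in ORDINALS:
        # Form 2: <ordinal> <device type> <use>
        return CoreDeviceSpec(use=fields[2],
                              ordinal=ORDINALS[fields[0].lower()],
                              kind=fields[1])
    raise ValueError(f"unrecognized configuration line: {line!r}")
```

Keeping both forms in one record lets later code resolve an ordinal
specification to a concrete device only after the loop has been
scanned.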

It is possible to specify explicitly that the server operate with no
pointer device or no keyboard device, or both. In addition, the
keyboard can be specified as the keyboard device and the pointer
device. This feature is provided for working environments where it is
not desirable to have a separate pointer device. If a keyboard is used
as the pointer device, the user can specify in the X server
configuration file which keys cause the pointer to move and the
magnitude of movement. These keys are taken over by X and are not
available for use by client programs. To prevent conflicts in the use
of these keys between X and client programs, it is possible to specify
that the keys should be used for pointer movement only if a specified
set of the modifier keys (e.g., left Shift, right Shift, CTRL, left
Extend char, and right Extend char) are pressed at the same time. The
user can also specify which keys should be interpreted as buttons for
the pointer device.

Default choices for the core devices reflect the devices most commonly
used as the default keyboard or pointer device. For example, if a
keyboard is attached to the HP-HIL and can be opened by the X server,
it is used as the keyboard device. If more than one keyboard is
attached, the last one, that is, the one most distant from the
computer on the HP-HIL, is used. If no keyboard can be opened by the
server, the last key device, such as a barcode reader or 32-button
module, is used. For the default core pointer device, if an HP-HIL
mouse is attached to the HP-HIL, it is used as the pointer device. If
no mouse can be opened by the server, the last device on the HP-HIL
that can generate motion data is used. If no such device can be found,
the keyboard is used as the pointer device. If the motion device
chosen is one that can report more than two axes of motion, axes
beyond the first two are ignored.
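
The default rules above amount to a short fallback chain for each core
device. The Python sketch below restates them under stated
assumptions: the HilDevice record and the device kind names are
hypothetical, and choosing the last mouse when several are attached is
an assumption made by analogy with the keyboard rule, since the text
does not say which mouse wins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HilDevice:
    position: int          # 1 = closest to the computer on the loop
    kind: str              # hypothetical kind name, e.g. "keyboard", "mouse"
    openable: bool = True  # whether the server can open the device


# Hypothetical groupings of device kinds.
KEY_KINDS = {"keyboard", "barcode", "buttonbox"}
MOTION_KINDS = {"mouse", "trackball", "tablet", "touchscreen", "knobbox"}


def _last(devices: List[HilDevice]) -> Optional[HilDevice]:
    """Return the device most distant from the computer, if any."""
    return max(devices, key=lambda d: d.position) if devices else None


def default_keyboard(loop: List[HilDevice]) -> Optional[HilDevice]:
    usable = [d for d in loop if d.openable]
    # Prefer the last real keyboard, then the last key device of any kind.
    return (_last([d for d in usable if d.kind == "keyboard"])
            or _last([d for d in usable if d.kind in KEY_KINDS]))


def default_pointer(loop: List[HilDevice]) -> Optional[HilDevice]:
    usable = [d for d in loop if d.openable]
    # Prefer a mouse, then the last motion device, then the keyboard.
    return (_last([d for d in usable if d.kind == "mouse"])
            or _last([d for d in usable if d.kind in MOTION_KINDS])
            or default_keyboard(loop))
```

Ignoring axes beyond the first two would happen later, when motion
data from the chosen device is converted into pointer movement.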

Some additional functionality was provided for HP 9000 Series 800
computers. These machines are capable of supporting up to four HP-HIL
loops, each of which can be associated with a set of input devices.
Our goal for these machines was to provide maximum flexibility in
specifying input devices while still providing reasonable defaults if
no specification is made. The method chosen provides a default based
on the display number specified when X is invoked. This display number
is used to determine which configuration files are used in
initializing the server.

The user can specify an HP-UX path to be searched for all input
devices or the path to be used for an individual input device. This
functionality was implemented to allow the HP-HIL path to be
explicitly chosen on HP 9000 Series 800 computers. However, it also
proved useful during project testing. A test tool that was written to
simulate HP-HIL driver input used this feature to simulate input from
various input devices. The result was greater flexibility in testing
various combinations of hardware. See the article on page 42 for more
information about project testing.

HP Input Extensions 

Although the core protocol of the X Window System is standard across
all vendors, X was also designed to allow individual vendors to
implement extensions to that protocol. This allows vendors to add
functions that are specific to their hardware or software
requirements, or that are not included in the core protocol. If these
extensions are found to be useful for the general X community, a
procedure exists to propose them as standards to be included in future
releases of X.

This was the method chosen to add support for HP-HIL devices within
the X server. It provided a solution that met the needs of X clients,
while also providing Starbase clients with information from input
devices that could not be reported through the core X protocol. See
the box on page 39 for an example of X protocol and X extension
format.

There are two parts to most X extensions: library functions to invoke
the protocol requests it defines, and a server portion to process the
requests and implement the functions. The X protocol defines the
format of requests in the X library. An input X extension is more
complicated than other X extensions because it also involves the
creation of new input events, code to generate the events within the
server, a means to allow clients to ask to receive those events, and
code to route the events to the appropriate clients. Unlike many
extensions, input X extensions require additions to both the device
independent and device dependent portions of the server.

To provide functionality equivalent to that provided for the core
devices, it was necessary to implement protocol requests that are
analogous to core protocol requests and also allow the user to specify
which device should be manipulated. These functions include the
ability to select input events from a device, control the focus of
that device, and "grab" (temporarily take exclusive control of) a
device.

Other necessary functions include those that allow a
client to list all the input devices available to the X server, 
and functions to enable and disable those devices. Also, 
input events for this extension were defined so that more 
than two dimensions of motion data could be reported. 

Technical Issues and Trade-offs 

The major input extension implementation issue we encountered was how
to treat input devices other than the pointer that report motion data.
The position of a typical pointer, such as a mouse, is tracked by the
server and a cursor is echoed at that position on the display by the
server. A keyboard takes its position from the pointer,* and its focus
is either explicitly set or is determined by the position of the
pointer. It was obvious that additional key devices should be treated
like the keyboard, but it was not obvious how additional motion
devices should be treated.

The alternatives were either to treat all devices supported through
the extension like the keyboard or to treat additional motion devices
like the pointer. If they were all treated like the pointer, the
server would have to track their position and echo a cursor for them,
and not allow their focus to be explicitly set by the client. If they
were absolute devices, their input would have to be scaled. If instead
they were treated like the keyboard, the server would not have to
track their position individually but would take it from the position
of the pointer. The server would not echo a cursor for them, but would
leave that up to the clients and allow their focus to be explicitly
set. To give clients maximum flexibility, it was decided to treat all
devices supported through the extension like the keyboard.

*When a keyboard key is pressed, one of the parameters returned to the
application is the pointer (cursor) position.
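
Treating an extension device like the keyboard means its input goes
either to an explicitly set focus window or, failing that, to the
window determined by the pointer position. The Python sketch below
illustrates only that rule; the FocusTable class, its method names,
and the use of strings for devices and windows are hypothetical and do
not reflect the server's data structures.

```python
from typing import Dict, Optional


class FocusTable:
    """Per-device focus, keyboard style: explicit focus or follow the pointer."""

    def __init__(self) -> None:
        self._focus: Dict[str, Optional[str]] = {}

    def set_focus(self, device: str, window: Optional[str]) -> None:
        # Passing None clears the explicit focus so the pointer decides again.
        self._focus[device] = window

    def destination(self, device: str, window_under_pointer: str) -> str:
        """Return the window that should receive input from the device."""
        explicit = self._focus.get(device)
        return explicit if explicit is not None else window_under_pointer
```

Because no per-device position is stored, the server tracks only the
core pointer, which is the simplification the keyboard-style choice
buys.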

Input Devices and Starbase 

The Starbase library provides functions to open input devices and to
receive two- or three-dimensional world-coordinate input. Several
device drivers allow Starbase to receive input from different devices
or from the same device in different environments. In some of these
environments access to input devices has been exclusive, allowing only
one program at a time to open and access a device. Shared devices for
Starbase applications have been supported under previous HP window
systems, but only for a pointer and a keyboard. Therefore, the major
Starbase contribution to this project has been providing the ability
for multiple programs to share all input devices.

At first it was not known how to achieve the desired device sharing
functionality. However, once it was determined that an input extension
would be provided, the basic approach was to provide device driver
code that uses either core or X extension Xlib calls to obtain input
from the requested devices. In this manner, the X server provides
shared access to all devices for both Starbase and X clients (see
Fig. 1). The X server arbitrates the sharing of input between
programs, and applies normal focus rules to Starbase and X programs.
The new device driver code is similar to the existing Starbase HP-HIL
driver code, differing only in how it obtains input from a device.

The syntax of the gopen call that names the input device to be opened
was enhanced to allow specification of an input device and window
combination. This allows the driver to make a request in the form
expected by the X server to open that device and request input from
it. Since many Starbase programs specify this information through
HP-UX environment variables or program parameters, they can take
advantage of the enhanced syntax without changing the source code of
the program.

It was possible to access the core input devices through Starbase
input requests in previous releases of X, and compatibility has been
maintained so that client programs can continue to access these
devices as before. However, in previous releases of X, except for the
keyboard and pointer, it was not possible to access input devices in a
manner that would allow them to be shared among programs. Also, it was
not possible to access them across a network. As a result of this
project, programs can take full advantage of the window system and
network, while continuing to use input devices and access them for
Starbase input.

Direct Access to Input Devices 

Client programs can open and directly access input devices that are
not in use by the X server. This allows a program that was not written
for a windowed environment to continue to work. However, only one
instance of that program can be run at a time, thus preventing other X
clients from using that device. Although a good feature for existing
programs that do not require a windowed environment, direct opening of
input devices is not a recommended practice for any newly written or
ported programs.

The core pointer and keyboard devices cannot be directly accessed by
client programs, since the X server opens those devices.

Conclusion 

The result of this project is that existing applications are
supported, and an easy transition to a windowed environment is
provided for them. As shown by Fig. 1, programs have a number of
optional ways to access the input devices. Exclusive access to input
devices other than the core devices is supported, although not
recommended for new clients. Shared access through X libraries is
supported for both core and extension input devices. Shared access
through Starbase input routines is supported for both core and
extension input devices, and is provided in a way that minimizes
changes to existing Starbase programs.

Sharing Testing Responsibilities
in the Starbase/X11 Merge System

The testing process for the Starbase/X11 Merge software involved
setting realizable quality goals and using extensive test suites and
test tools to measure and automate the process.

by John M. Brown and Thomas J. Gilg 



WITH THE DEVELOPMENT OF the Starbase/X11 Merge environment, new forms
of testing had to be considered. Before the Starbase/X11 Merge
project, the X test suites consisted of nearly 450 tests, and the
Starbase test suite contained nearly 400 tests run across an average
of 40 hardware configurations. The challenge was to make the
appropriate modifications to this extensive set of tests to make them
useful in the Starbase/X11 Merge environment. In areas where the
existing test suites were inadequate, new test tools and tests had to
be developed.

Test and Quality Goals 

The combination of existing and new test suites needed to ensure
adequate code coverage. Adequate code coverage in this context means
exercising all procedural interfaces (i.e., X and Starbase library
calls), and the in-depth testing of each procedure. An HP software
tool known as the branch flow analyzer (BFA) was used to measure code
coverage. Code quality was measured in terms of defect densities and
defect arrival rates. The project quality goals were stated in terms
of acceptable defect densities (defects per 10 KNCSS*) for each class
of defect severity. Furthermore, defect arrival rates (defects per
1000 test hours) were closely monitored throughout the project, and
objectives were set to achieve specific diminishing arrival rates at
project checkpoints.
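
Both quality measures are simple normalized ratios. The Python sketch
below shows the arithmetic only; the function names and the
normalization parameters are hypothetical, and any numbers passed in
would be illustrative rather than project data.

```python
def defect_density(defects: int, ncss: int, unit_kncss: float = 10.0) -> float:
    """Defects per `unit_kncss` thousand noncomment source statements.

    The default unit is an assumption chosen for illustration.
    """
    if ncss <= 0:
        raise ValueError("ncss must be positive")
    return defects / (ncss / 1000.0) * unit_kncss


def arrival_rate(defects: int, test_hours: float,
                 per_hours: float = 1000.0) -> float:
    """Defects found per `per_hours` hours of testing."""
    if test_hours <= 0:
        raise ValueError("test_hours must be positive")
    return defects / test_hours * per_hours


def is_diminishing(rates: list) -> bool:
    """True when each checkpoint's arrival rate is below the previous one."""
    return all(later < earlier for earlier, later in zip(rates, rates[1:]))
```

Computing the density separately for each severity class and the
arrival rate at each checkpoint gives the two series that the goals
above are stated against.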

Strategy 

Existing test technologies for X and Starbase were reviewed for their
suitability in testing the Starbase/X11 Merge system. In several
cases, the existing technologies and their related test suites
required no modifications. In other cases, weaknesses were identified
and an effort was undertaken to enhance the remaining test tools and
test suites. With nearly 850 pre-Starbase/X11 Merge tests and several
hundred megabytes worth of time-proven archives, the value of such an
undertaking was obvious. Two test strategies were undertaken. First,
new tests were developed that could be directly incorporated into the
existing test suites. Second, for all the test scenarios not covered,
new test tools and tests were developed.

In all cases, a high priority was placed on the automation of tests. A
best-case scenario was envisioned in which all the code changes,
deletions, and additions developed in one day would be tested
overnight on all available resources, and a summary of the test
results would be generated automatically for inspection by the
engineers the following day. In addition to the testing effort, code
reviews helped round out the quality assurance effort. A code review
or code walkthrough was conducted for each new code module. Attendance
included the code author, a moderator or code reader, and several
reviewing engineers.

*Thousands of noncomment source statements.

Testing Measures 

To help guide the testing effort, several test and quality metrics
were identified and used. These metrics include:

■ Branch Flow Analyzer (BFA) Coverage. The branch flow analyzer
provides a measure of how well all the code in the software under test
is exercised (covered) during the testing effort. To use the BFA, the
source file to be tested is run through a BFA preprocessor, which
places counters at all conditional statements and at the beginning of
all procedures (see Fig. 1a). The source file produced by the
preprocessor is then compiled in a standard manner. When the program
is run, the counters embedded in the code update an external
disk-based data base, which can later be analyzed. Analysis of the BFA
data base provides a summary of which procedures are called, and a
breakdown for each procedure is given showing which conditional paths
were executed or, more important, missed (see Fig. 1b). The BFA tool
identified unexercised sections of code to be targeted when writing
new tests.

■ Defect Density. To measure the current product quality, the defect
density described the expected number of severity-weighted defects
(critical, serious, low) per 10 KNCSS.

■ Defect Arrival Rate. As a way to sense trends in quality, the defect
arrival rate described the number of defects found per 1000 hours of
testing.

■ Continuous Hours of Operation. A continuous hours of operation test
was frequently executed to give an indication of X server robustness,
and to reveal any long-term execution side effects (e.g., memory
utilization growth).
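
The coverage measurement described in the first item can be pictured
as counters inserted at procedure entry and on each conditional path.
The Python sketch below imitates by hand what such a preprocessor
would generate; it is not the BFA tool. The probe names, the in-memory
counter standing in for the disk-based data base, and the clamp
procedure are all hypothetical.

```python
from collections import Counter

# Stands in for the external data base that the embedded counters update.
hits = Counter()

# Every probe the (imaginary) preprocessor inserted, in source order.
ALL_PROBES = ["clamp:entry", "clamp:below", "clamp:above", "clamp:in_range"]


def probe(name: str) -> None:
    """Record that execution reached an instrumented point."""
    hits[name] += 1


def clamp(x: int, lo: int, hi: int) -> int:
    """Example procedure after instrumentation."""
    probe("clamp:entry")
    if x < lo:
        probe("clamp:below")
        return lo
    if x > hi:
        probe("clamp:above")
        return hi
    probe("clamp:in_range")
    return x


def missed() -> list:
    """Return the probes that no test has exercised yet."""
    return [name for name in ALL_PROBES if hits[name] == 0]
```

After a test run, the list returned by missed() plays the role of the
breakdown of unexecuted paths, pointing at the cases new tests should
target.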

Engineer Test Suites 

The end users for the Starbase/X11 Merge product are software
engineers who develop high-performance graphics applications running
in windowed environments. With this information we figured that some
of the best test cases could be leveraged from the engineers
developing the Starbase/X11 Merge code. Therefore, an effort was made
to formalize the process that engineers naturally go through when
trying a new version of the X server for the first time.

All engineers were asked to develop a short list describing the things
they typically tried. When [...] approached, all engineers ran through
their mini-suites and [...]. With little additional effort, such
testing proved valuable.

Starbase Test Suite 

The Starbase test suite has historically been used to perform testing
of the Starbase graphics library on all of HP's supported graphics
display devices and workstation configurations. The test suite
consists of nearly 400 test programs, archive files of expected
results, and various shell scripts and C programs that control test
suite automation.

When a test program is run as part of an automated session, the
resulting standard output and error output are compared against the
expected result archives. In addition, representations of the various
graphics images that may have been generated by the test program are
compared with the archives. Specific differences between actual and
expected results are noted in a test suite log file, and simple
pass/fail information is placed in a summary file.
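
The archive comparison can be sketched as a diff of actual against
expected output, with the differences going to a log and a one-line
verdict going to a summary. The Python below is a minimal
illustration, assuming text output only; the function names and the
PASS/FAIL line format are hypothetical, and the real suite was driven
by shell scripts and C programs.

```python
import difflib
from typing import Dict, List, Tuple


def compare_with_archive(name: str, actual: str,
                         expected: str) -> Tuple[bool, List[str]]:
    """Return (passed, diff lines) for one test program's output."""
    diff = list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile=f"{name}.expected", tofile=f"{name}.actual", lineterm=""))
    return (not diff, diff)


def run_session(results: Dict[str, str],
                archive: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare every result with its archive; return (summary, log) lines."""
    summary: List[str] = []
    log: List[str] = []
    for name in sorted(results):
        # A test with no archived expectation is compared against empty text.
        passed, diff = compare_with_archive(
            name, results[name], archive.get(name, ""))
        summary.append(f"{name} {'PASS' if passed else 'FAIL'}")
        log.extend(diff)
    return summary, log
```

Keeping the detailed differences out of the summary is what makes the
summary quick to scan the morning after an overnight run.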

Before the Starbase/X11 Merge system, the test suite was used to test
Starbase running only on a raw display device rather than in a
windowed environment. With the advent of the Starbase/X11 Merge
system, there was a need to enhance our Starbase testing approach to
include not only raw device testing, but also testing of Starbase in
the X Window System environment.

Starbase test programs in the Starbase/X11 Merge environment take two
basic forms:

■ Window Naive. A window naive test can run either in raw mode or in
X. The test itself has no knowledge of X, and does not create X
windows itself, but instead relies on an outside mechanism to create
the windows and direct the test to those windows.

■ Window Smart. A window smart test can only run in X. By definition,
a window smart program makes X calls, and usually creates its own
output windows.

The enhancements made to the Starbase test suite had to be able to
support both varieties of test programs. An additional goal of the
changes was to leverage as much of the existing test suite as
possible. To test window naive programs, the test suite was modified
so that it could recreate various selected X window scenarios and then
run test programs in each scenario. Since window naive programs can be
run on a raw display or in an X environment, we were able to use a set
of the existing test programs in these scenarios. Of course, new
archives of expected results had to be created for each scenario.

To cover window smart testing, an additional X window scenario was
used in the test suite. Also, since none of the existing test suite
programs contained both Starbase and X library calls, a set of new
test programs had to be written to test this new functionality
adequately. Areas of particular testing attention included text fonts,
cursors and echoes, color map manipulation, backing store, double
buffering, and z-buffering.

Once the changes to the test suite were in place, the suite was run
nightly in a test center stocked with a complete set of graphics
display devices and workstation configurations. An additional set of
tools was developed to gather and report test results automatically
from each configuration on a daily basis. This was done even during
the latter part of the Starbase/X11 Merge project implementation
phase, and it enabled developers to track the quality of their code as
it was being completed. During the testing and release phases, the
nightly test suite results helped ensure continuing improvement in
code quality and stability.

X Test Consortium Test Suites 

Through HP's affiliation with the X Test Consortium, several X test
suites were acquired. The Digital Equipment Corporation's X test suite
(nearly 350 tests) tests each call available in the Xlib library. The
tests themselves come in two categories: good-only tests or centerline
tests, which just test for expected functionality. Validate and error
tests expand on the centerline tests by checking for robustness using
invalid parameters and erroneous functionality.

The Sequent Computer Corporation, which is a member of the X Test
Consortium, provided an X test suite that consists of nearly 125 tests
that exercise the server at the X protocol level. The tests themselves
do not use Xlib, but instead contain custom buffering routines to send
X protocol requests to and receive replies from the server. The object
of these tests is to see how well the X server handles malformed
protocol packets not normally generated through the X library calls.

Early in the testing effort, the decision was made to make the X Test
Consortium suites more manageable by controlling them with HP's
scaffold automation tool.1 The scaffold provided the framework to
manage the large body of tests, and also provided some input and
output archiving. With the scaffold in place, the test suites were run
nightly by an HP-UX cron script on all unoccupied workstations used by
the Starbase/X11 Merge development team.

HP-HIL and Input Extension Test Suite 

With the addition of several input extensions to the X server, a new
input extension test suite had to be developed. Previous input testing
tools proved to be inadequate for three reasons:

■ HP-HIL (HP Human Interface Loop) activity was usually captured after
some processing of the HP-HIL activity had already occurred.

■ Previous test tools required that the code under test be modified to
accommodate the test mechanism.

■ Previous test tools could only handle keyboard and mouse activity,
thereby excluding the new HP input extensions to the server.

The HP-HIL simulator, which was leveraged from an existing HP
Windows/9000 test tool, allows multiple HP-HIL devices to be simulated
and tested at once, including the new input extensions. The HP-HIL
simulator operates in record/playback modes. The record mode requires
the HP-HIL devices, the simulator, and a tester to run the test and
use the HP-HIL devices. When it is recording, the simulator captures
all HP-HIL activity and puts it into a file. In playback mode, the
simulator uses the file captured during the record mode in place of
the real HP-HIL devices. The tester only needs to start the test
program in playback mode. All of the HP-HIL data, regardless of its
source, is sent to the server.

The HP-HIL simulator is installed by creating a pty (pseudo tty) in
the /tmp directory for each input device on the HP-HIL loop. This sets
up a communication path between the ptys and the real HP-HIL devices.
To ensure that the X server will use the ptys in /tmp, an appropriate
entry is made in the server's Xndevices file to change each device's
path from /dev to /tmp. The Xndevices file is used by the server to
determine its input device locations.

When a recording test session is started and the server tries to open
what it thinks is an HP-HIL device, it is connected to a pty and the
HP-HIL simulator is triggered to open the real HP-HIL device. Once
this is done, the HP-HIL simulator, transparent to the test program,
passes all HP-HIL device activity back and forth while saving all
HP-HIL activity along with timing data into a file. The timing data
ensures that realistic playback is provided. Fig. 2a shows the setup
for test recording.

For HP-HIL playback, the file that was saved during recording is
simply read by the simulator, and the appropriate HP-HIL activity is
generated in the same time sequence it was recorded and fed into the
pty. During the playback sessions the real HP-HIL devices do not have
to be present on the HP-HIL loop. This facility allows suites recorded
using the HP-HIL simulator to run on any machine without concern for
the presence of HP-HIL devices, which are sometimes hard to find. The
setup for playback is shown in Fig. 2b.
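
The record and playback modes can be sketched as a pass-through that
timestamps each chunk of device data, plus a reader that replays the
chunks with the same spacing. The Python below is an illustration
under stated assumptions, not the simulator itself: the Recorder class,
the play_back function, and the one-JSON-object-per-line file format
are all hypothetical, and no ptys are involved.

```python
import json
import time
from typing import Callable, IO, Iterable


class Recorder:
    """Pass device data through unchanged while logging it with timing."""

    def __init__(self, sink: IO[str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._sink = sink
        self._clock = clock
        self._start = clock()

    def capture(self, device: str, data: bytes) -> bytes:
        record = {"t": self._clock() - self._start,
                  "dev": device, "data": data.hex()}
        self._sink.write(json.dumps(record) + "\n")
        return data  # the caller forwards this to the server unchanged


def play_back(source: Iterable[str],
              deliver: Callable[[str, bytes], None],
              sleep: Callable[[float], None] = time.sleep) -> None:
    """Replay recorded activity in the same time sequence it was captured."""
    elapsed = 0.0
    for line in source:
        record = json.loads(line)
        sleep(max(0.0, record["t"] - elapsed))  # wait out the recorded gap
        elapsed = record["t"]
        deliver(record["dev"], bytes.fromhex(record["data"]))
```

Injecting the clock and sleep functions keeps the sketch testable
without real delays; a playback run on a machine with no devices
attached would pass time.sleep and a function that writes to the pty.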

The HP-HIL simulator was used to test the server input extensions, and
was then incorporated into the test scaffold. The simulator was also
used by another group to simulate foreign versions of the HP-HIL
keyboard to test native language support (NLS) functionality.

Fig. 2. (a) Initial setup for recording data from HP-HIL devices.
(b) Setup for playback.

GRM Test Suite 

The graphics resource manager (GRM) consists of a daemon process and a
client interface library. The suite of tests developed for the GRM
system is partitioned according to the various functional components
of the GRM. A test module was developed for each of the following
functional categories:

■ Client/Server Protocol. The serial data stream between the GRM
client and the GRM daemon.

■ Object Allocation (semaphores). The maintenance of all display
hardware resource allocations.

■ Offscreen Memory Management. The allocation and deallocation of
three-dimensional blocks of offscreen memory.

■ Shared Memory Management. The creation, allocation, and deallocation
of chunks of shared memory.5

■ Sequence Control. The maintenance of request sequences for multiple
processes.

■ Listing of Objects. The wild-card matching and listing of all GRM
objects.

With the exception of the protocol test module, all of these test
modules tested the operation of the GRM daemon through the standard
GRM interface library. For the protocol test module, some library
routines were replaced with altered versions of the original library
routines to achieve the desired test procedure.

Although the GRM daemon is designed to operate with multiple clients,
the tests were designed to have exclusive use of the GRM. If another
GRM client process was detected by the test process, the test would
identify the error and exit. Since only one GRM daemon will run on a
single host at any particular time, the test environment had to be
free of any graphics applications that used Starbase or the X server.

XDI Test Harness 

The X driver interface, or XDI, has about four dozen entry points in
the device dependent portion of the X server. The X driver interface
provides an interface between a translation module and the low-level X
display drivers that perform the actual display control and rendering
operations on the display hardware. The translation module is
responsible for translating requests from the device independent
portion of the X server into a form suitable for X display drivers.
This architecture allowed independent development by HP engineers in
two different organizations and locations, and provided a platform for
code sharing. The translation module was done at HP's Corvallis
Information System Organization, and the display drivers (for X and
Starbase) were done at HP's Graphics Technology Division. The article
on page 6 describes the Starbase/X11 Merge X server and the XDI, and
Fig. 2 on page 9 shows the X server architecture.

With the significant advantages of this newly defined interface, there
came corresponding new testing demands, because high-quality,
well-tested X display drivers had to be delivered at regular
intervals, and these drivers had to be developed whether or not any
server code was available. While much of the underlying driver code
was shared by the Starbase driver code, the X driver interface was
tailored to the needs of the X server. The differences between the
Starbase driver interface and XDI were sufficient to prohibit direct
use of the Starbase test suite. Since the test suite could not be
directly used, other approaches were explored that would meet our
testing needs and leverage as much of the existing test suite
technology as possible.

To provide a tool for debugging and automated testing, the XDI test
harness was developed. The harness provides:

■ A user interface for each XDI entry point.

■ A means for importing and manipulating the associated data
structures.

■ Support for a subset of C programming language commands.

What makes the harness an unusual testing tool is the way in which it
acts as an interpreter that receives input commands either
interactively or from text script files.

The XDI test harness offers several advantages over more traditional
testing approaches that involve compiling various test programs and
then linking each of them with the code under test. The harness needs
to be linked only once with the code under test, and since the harness
is interpreter-based, any number of test programs can be run without
the need to link each one. The harness also makes test programs easier
to write and modify because it provides a convenient interface to the
XDI entry points and the ability to manipulate data structures.
Finally, disk space is conserved because only the harness and not the
numerous test programs need to be linked with the large driver
libraries.
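
The link-once, interpret-many idea can be sketched as a small command
interpreter that dispatches script lines to registered entry points.
The Python below is only an illustration of the approach, not the XDI
test harness: the Harness class, the set command, the $name variable
syntax, and the integer-only arguments are all hypothetical
simplifications, and any entry point registered with it is a stand-in.

```python
import shlex
from typing import Callable, Dict, List


class Harness:
    """Interpret script lines and dispatch them to registered entry points."""

    def __init__(self) -> None:
        self._entry_points: Dict[str, Callable[..., object]] = {}
        self.variables: Dict[str, int] = {}

    def register(self, name: str, func: Callable[..., object]) -> None:
        # In a real harness this binding happens once, at link time.
        self._entry_points[name] = func

    def run_line(self, line: str):
        words = shlex.split(line, comments=True)  # '#' starts a comment
        if not words:
            return None
        if words[0] == "set":  # set NAME VALUE
            self.variables[words[1]] = int(words[2])
            return None
        func = self._entry_points[words[0]]
        args = [self.variables[w[1:]] if w.startswith("$") else int(w)
                for w in words[1:]]
        return func(*args)

    def run_script(self, text: str) -> List[object]:
        """Run every line; collect the non-None results in order."""
        results = (self.run_line(line) for line in text.splitlines())
        return [r for r in results if r is not None]
```

Because test programs are plain text, adding a test means writing a
script rather than compiling and linking another binary against the
driver libraries.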

As a result of these advantages, the XDI test harness proved to be a
useful tool for XDI code development and debugging. In addition, with
relatively minor changes to the Starbase test suite tools, the XDI
test harness was integrated directly into the test suite. An extensive
set of new harness test programs was developed to test all the types
of graphics display devices supported by the Starbase/X11 Merge
system. Once the tools and test programs were in place, this new XDI
test suite was run nightly in the test center.

Interactive Testing 

While it was desired to automate as many tests as possible, not all
server activities could be automated. Furthermore, a measure of
randomness not provided by the automated tests needed to be added.
Areas especially suited for this type of testing included object
manipulations with the X cursor (e.g., moving a window), screen
changes when running in a stacked screens mode, multiserver
environments, and Starbase echoes (cursors operated by Starbase) in X.
Usually, interactive testing allowed a wider range of scenarios to be
tried. When certain scenarios were identified as productive, an
attempt was made to automate them.

Conclusion 

With approximately 500 KNCSS between X and Starbase
and over 80 different hardware configurations, testing the
Starbase/X11 Merge system proved to be very challenging.
Available test tools and test suites provided the bulk of
our automated tests, while the branch flow analyzer coverage
led to the development of new test tools and many new
tests. During the latter half of the Starbase/X11 Merge project,
we realized there was a need for more user-interactive
tests. While automated tests are indispensable, we found
that a great many interesting and important defects can be
uncovered with the randomness provided by user-interactive
testing.

A Compiled Source Access System Using
CD-ROM and Personal Computers

HP Source Reader is in use in virtually every HP support 
facility around the world, giving local support engineers fast
access to complete source code listings for MPE, the 
HP 3000 Computer operating system. 

by B. David Cathell, Michael B. Katstein, and Stephen J. Pearce



HP SOURCE READER IS A SYSTEM for accessing
compiled source code stored on compact disk read-only
memory (CD-ROM) for purposes of system debugging.
The source code is stored in a proprietary format
that optimizes retrieval by the access program running on
an HP Vectra Computer.

HP Source Reader facilitates quick and efficient debugging
of HP 3000 Computer systems by allowing the user
to display source code at any point within a specified
procedure or segment. The user can then quickly scroll the
display or jump to any other location with precise control.
Relevant information can be "popped" onto the screen as
aids. This includes identifier definitions, reference
materials, and the assembly code corresponding to each
source line. The program also provides many useful
auxiliary functions including searching, printing, logging, and
a comprehensive set of customization options. A
context-sensitive help facility eliminates the need to consult
written documentation.

Unlike other source browsing systems, HP Source Reader
was written by and for engineers who debug HP 3000
Computers. The user interface is designed to be familiar to
support engineers who may not be knowledgeable about
personal computers. The program prompts users for
information in the same format as other tools they use. In
addition, to make the program easy to use, HP Source Reader
takes full advantage of the personal computer user
interface, including keyboard, mouse, pop-up windows, and
menus.

To our knowledge, HP Source Reader is the first system
in the industry that combines the convenience of one-step
source retrieval with the power of the CD-ROM and
personal computer (PC) technologies.

HP 3000 Debugging— Before 

HP 3000 Computers are debugged, for the most part, by
analyzing dumps of the computer's memory. When a system
fails, the operator dumps the memory to magnetic tape
and then restarts the computer. The tape is forwarded to
HP where it is formatted and analyzed by an engineer. The
engineer must examine source code while reading the
dump, comparing the failed system to what the source code
indicates should happen when the system is running
normally. The engineer is constantly alternating between the
source code and the dump throughout the analysis of the
problem.

Historically, memory dumps have been printed on paper.
This worked fine when the HP 3000 contained less main
memory, but this practice has gradually become untenable
with the advent of larger and larger systems. Therefore,
interactive tools have been developed that allow a dump
to be analyzed in an on-line mode. Over time, these
interactive tools have been enhanced to the point where they are
now powerful on-line tools that allow engineers to locate
and format specific information in a memory dump easily.
However, as these tools have matured, no parallel progress
has occurred allowing efficient on-line examination of
source code. Engineers have continued to depend on
printed listings stored in a shared library area.

Fig. 1 shows the complex manual process that must be
followed to locate specific source code in a listing from 
information presented in the memory dump. It should be 
apparent that this is exceptionally tedious. 

Project History 

In 1986, we began to rethink the strategy for the use of
workstations within our organization (HP Commercial Systems
Support). It seemed apparent that real productivity
gains could be made by engineers, managers, and support 
personnel through the use of readily available PCs and 
software. 

At the same time, we recognized that it was becoming 
feasible to marry the PC to the emerging technology of 
optical media. This marriage could provide a platform for 
an engineer to access the massive amount of information 
required for system level support of MPE, the HP 3000
operating system. 

It became apparent that much of the time spent in analyzing
system failures was not bringing expertise to bear on
the problem. Engineers were spending too much time in 
overhead activities — walking to a library, finding listings, 
and engaging in the long tedious process of source location. 
We felt this strictly mechanical work could and should be 
automated. 

We refined our ideas sufficiently to produce a PC-based
demo version of a program. This gave us the opportunity
to evaluate the user interface with feedback from engineers
who would be users of the actual program, if and when it
was produced. It also gave us a method to communicate
our vision to software developers who had experience in
the CD-ROM industry but who did not necessarily have
knowledge of our particular activities in supporting the HP
3000.

Unfortunately, when we surveyed what was available in
an attempt to save development effort, all we found were
natural-language-based keyword indexing data base
engines. These are not a viable solution because there is a
substantial difference between natural language and
computer language. For instance, the word "ball" has a small
number of meanings which are reasonably consistent from 
document to document. However, the variable "x" may 
have many different meanings depending on where it ap- 
pears in the source code. 

Eventually, we concluded that there were no existing 
solutions that we could leverage to meet our needs— we 
would have to develop a prototype. This first prototype 
was the proof of the validity of the concept. It had enough 
positive aspects to justify the resources to rewrite and then 
extend the programs.

The main body of effort is now complete and the generation
of HP Source Reader CD-ROMs is becoming a routine
manufacturing effort. The only remaining tasks involve
small utility programs to automate some partially manual
processes. In addition, we plan to continue to expand the
functionality of the access program as good ideas are
suggested and as time permits their implementation.

Project Goals 

At the beginning of the project our overriding objective 
was to improve the efficiency of HP 3000 system debugging. 
To achieve this, we established the following goals: 

■ Elimination of paper listings to save time, space, and 
mundane labor. 

■ Full use of emerging technology to make engineers' time 
as productive as possible. 

■ Ease of use to minimize learning time and errors.

■ Minimal impact on organizations that supply source 
code to avoid the need to reformat source code or modify 
procedures. 

■ Cost-effectiveness to make it easy for support organiza- 
tions to justify the expense required. 

CD-ROMs and PCs 

CD-ROM is a logical choice for the paperless environment.
A CD-ROM, a 4¾-inch plastic disk, can hold the
equivalent of a 35-foot stack of paper listings. This is
sufficient capacity to contain an entire release of the HP 3000
operating system and its supporting software. The disks
are inexpensive enough that each engineer can have a set.
Unlike paper, optical media are machine readable, allowing
for a wide variety of automated access techniques. Because
the CD-ROM is read-only, it cannot be overwritten; it
always has the integrity it had when it was manufactured.

The HP Vectra PC is an excellent system for implementing
a high-technology, ergonomic access program. It has
many features that make it comfortable for users: a mouse,
a full-color display, and the ability to pop up and remove
windows and menus as necessary. Since the Vectra is a
personal computer, each engineer has private use of the
system and its performance is not impacted by other users
competing for resources. In addition, the Vectra supports
a vast array of commercially available software and
hardware products. Some of these products provide mechanisms
for switching quickly between the source code and the
dump, capturing parts of both in an integrated document.
The Vectra is widely available within HP and is already
in use in many offices that would need to use HP Source
Reader.

Fig. 2 shows how HP Source Reader is used to accomplish
the task shown in Fig. 1. These two diagrams clearly show
the reduction in manual effort brought about by the access
program.

HP Source Reader 

HP Source Reader consists of two main parts. The first
is the data preparation system, which is used to generate
the CD-ROMs from the compiled source code as it is
produced by the lab. The second is the access program that
runs on the Vectra, which is used to locate and display the
source code stored on the CD-ROM.

CD-ROMs are generated whenever a new version of the
MPE V or MPE XL operating system is about to be released.
Each disk contains all the modules associated with a given
version. Fig. 3 shows the process flow used to convert the
data from its original form (in the lab) to its final form (on
the CD-ROM). Raw source code is maintained in the lab,
then compiled with the output listing files submitted for
inclusion on the CD-ROM. The compiler listings are
processed in a series of steps to produce a magnetic tape set.
The tapes are sent to a mastering facility, which
manufactures the disks.

Structure of the CD-ROM 

The CD-ROM has exactly the same physical structure as 
the now familiar audio CD. The only real difference be- 
tween the two is the meaning of the information recorded 
on the optical media, which represents computer data in 
the case of the CD-ROM and digitized music on the audio 
CD. Data is recorded as a series of pits positioned in a 
continuous spiral (beginning at the center of the disk). The
pits are read as ones and zeros when illuminated by a laser 
source. The bits are evenly spaced, requiring the drive to 
vary the rate of rotation to maintain a constant linear ve- 
locity. Additional bits are used to provide a high level of 
error correction. 
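
The relationship between constant linear velocity and a varying rotation rate can be illustrated with a little arithmetic. The sketch below is only an illustration: the linear velocity and the inner and outer radii are assumed, typical values for a compact disc and are not taken from this article.

```python
import math

def rotation_rpm(linear_velocity_m_s: float, radius_m: float) -> float:
    """Rotation rate (rev/min) that keeps the track moving past the
    laser at the given linear velocity when reading at this radius."""
    circumference_m = 2.0 * math.pi * radius_m
    return linear_velocity_m_s / circumference_m * 60.0

# Assumed, illustrative values (not from the article).
LINEAR_VELOCITY = 1.3   # metres per second
INNER_RADIUS = 0.025    # metres, near the start of the spiral
OUTER_RADIUS = 0.058    # metres, near the edge of the disk

inner_rpm = rotation_rpm(LINEAR_VELOCITY, INNER_RADIUS)
outer_rpm = rotation_rpm(LINEAR_VELOCITY, OUTER_RADIUS)
print(f"inner track: {inner_rpm:.0f} rpm, outer track: {outer_rpm:.0f} rpm")
```

Under these assumptions the drive must spin more than twice as fast at the center of the disk as at the edge, which is one reason seeks on a CD-ROM are slow compared with a magnetic disk.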



Additional structure is imposed to make it possible to
use the CD as a random-access device. A standard layout
of the disk directories and files known as the High Sierra
standard was proposed and widely accepted within the
industry. Microsoft Corporation was active in the definition
of the standard and quickly produced an intermediate level
driver that makes all High Sierra CD-ROMs look like very
large standard DOS discs (albeit read-only). CD-ROMs
recorded using this standard have approximately 550,000,000
bytes of available disk space for data and directories. The
wide acceptance of this standard and the availability of
the Microsoft CD-ROM extensions made it possible for our
project to develop our access program using the normal
DOS file functions.

From the very beginning of the project, it was evident 
to us that the organization of the many files that would be 
on the CD-ROM was of paramount importance. A poor
choice would have resulted in terrible performance. The
resulting design makes extensive use of DOS subdirectories
to group modules in a pattern logically similar to that of
MPE. Fig. 4 shows the directory structure of the CD-ROM. 
The root directory contains only a file describing the con- 
tents of the CD-ROM. The second level subdirectories are 
of three types— one for system libraries, one for programs, 
and one for the reference documents. 

The system library subdirectory contains only a file list- 
ing all the entry points and segments for that library. The 
modules themselves are located in subdirectories below 
the system library directory. Each module subdirectory 
contains a set of files containing the compressed source
code, identifiers, cross reference, procedure map, and,
optionally, the object code for that module.

The program subdirectories contain a set of files contain- 
ing the compressed source code, identifiers, cross refer- 
ence, procedure map, and optionally, the object code for 
that program.

The document subdirectory contains a set of files
containing the compressed text page list, table of contents,
and index for each document.

Fig. 2. HP Source Reader method of source code location.

In addition to providing good performance, this structure
has proved to be quite robust; only small extensions were
required to include the changes brought about by MPE XL.
Originally, we had only one system library subdirectory
and now we have three. In addition, a new directory type
was defined for include files (these are files that are
incorporated into the source code of multiple modules to provide
common definitions, etc.). In the case of MPE XL, a set of
files containing the compressed source, identifiers,
and cross reference for the large include file DWORLD
resides in that directory. Thus this shared information is
recorded only once, greatly reducing the amount of disk
space required.
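
The directory layout described above can be pictured as a simple path-building rule. The following is a minimal sketch only: the article names the kinds of files kept for each module but not the file or directory names, so every name below (the drive letter, `SL`, `MODULE1`, and the file names) is hypothetical.

```python
from pathlib import PureWindowsPath

# Hypothetical file names, one per kind of data listed in the text.
MODULE_FILES = ("SOURCE", "IDENTS", "XREF", "PMAP", "OBJECT")

def module_dir(drive: str, library: str, module: str) -> PureWindowsPath:
    """Module subdirectories sit below their system library directory."""
    return PureWindowsPath(f"{drive}:\\") / library / module

def module_file(drive: str, library: str, module: str, kind: str) -> PureWindowsPath:
    """Path of one of the per-module files on the CD-ROM."""
    if kind not in MODULE_FILES:
        raise ValueError(f"unknown file kind: {kind}")
    return module_dir(drive, library, module) / kind

# Example: the procedure map of a hypothetical module in a hypothetical library.
print(module_file("L", "SL", "MODULE1", "PMAP"))
```

Because the High Sierra driver presents the disc as an ordinary DOS volume, a rule of this kind is all the access program needs to turn a library and module name into a file it can open with the normal DOS file functions.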

Filters 

In the compact disc industry, a filter is a program that
accepts some form of data and reformats it for use on a
CD-ROM. The files on the CD-ROM are designed and organized
to facilitate rapid retrieval of the desired information. The
application designer can take advantage of the fact that
optical media can be read but not written. Thus it is
desirable to do as much processing as possible during the data
preparation phase. This should result in less processing
and, presumably, faster data retrieval by the access and
display programs.

For the HP Source Reader project to succeed, we had to
minimize any additional effort that might be required of
other organizations. In our case, that meant that the input
data for the filter program would have to be the same
compiler-generated listing files that were already supplied for
each release of MPE. These are exactly the same files that
we previously printed and archived in our library.



Fig. 4. CD-ROM directory structure.

The initial prototype filter was for SPL, the primary
language used in MPE V. The result was tantalizing in that it
gave us a glimpse of the tool that we had envisioned.

We learned from this prototype when we began the
design of the filter for Pascal/XL (the primary language used
in MPE XL). The major goal was to automate the processing
of the huge number of listing files. The logical solution
was a data base that would contain enough information
about each operating system module to make the need for
human intervention minimal. Thus the filter could locate
files on the system to be filtered, determine which filter
was to be used, and record the results of the filtering in
the data base. This goal also dictated that filtering be done
on a more powerful computer system than a PC, and an
HP 3000 Series 70 was chosen.

A secondary goal was that the overall environment and
the program structure be suitable for extending the filter
for other programming languages. Proper design of the data
base would easily allow extending the environment. To
facilitate extending the filter program itself, we chose a
three-pass philosophy.

Fig. 3. CD-ROM production process.

The first pass parses each input record and determines
what part of the listing it represents. It then reformats
information to be retained and writes it to the appropriate
intermediate file. The second pass performs certain cleanup
tasks such as removing duplicate information regarding
identifiers. The final pass generates the target files for the
CD-ROM.
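
The three passes described above can be sketched in a few lines. This is an illustration of the structure only, not the actual filter: the one-letter record tags (`S` for source, `I` for identifier, `X` for cross reference) are a made-up stand-in for real compiler listing formats, and the intermediate and target "files" are ordinary in-memory lists and strings.

```python
def pass1_classify(records):
    """Pass 1: decide what part of the listing each record represents
    and route the retained text to the matching intermediate 'file'."""
    intermediate = {"source": [], "identifiers": [], "xref": []}
    routes = {"S": "source", "I": "identifiers", "X": "xref"}
    for record in records:
        tag, _, body = record.partition(" ")
        if tag in routes:
            intermediate[routes[tag]].append(body)
        # Anything else (page headers, banners) is not retained.
    return intermediate

def pass2_cleanup(intermediate):
    """Pass 2: remove duplicate identifier entries, keeping first-seen order."""
    cleaned = dict(intermediate)
    cleaned["identifiers"] = list(dict.fromkeys(intermediate["identifiers"]))
    return cleaned

def pass3_generate(cleaned):
    """Pass 3: produce one target 'file' (here, a string) per kind of data."""
    return {kind: "\n".join(lines) for kind, lines in cleaned.items()}

def run_filter(records):
    return pass3_generate(pass2_cleanup(pass1_classify(records)))

listing = ["H PAGE 1", "S BEGIN", "I LDEV INTEGER", "I LDEV INTEGER",
           "X LDEV 12", "S END"]
targets = run_filter(listing)
```

In a layout like this, only the first pass needs to understand a particular compiler's listing format, which is consistent with the high degree of code reuse the authors report when adding filters for other languages.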

Although the first implementation using this three-pass 
philosophy was for Pascal/XL, we found that more than 
95% of the code was retained when we extended the program
to handle Pascal/3000 (the MPE V version of Pascal).
The second and third passes were only minimally changed.
Perhaps this result is not very surprising in the case of
such closely related Pascal compilers. However, we found
that more than 90% was retained when we implemented 
the SPL version of the filter. With these three filters, we 
can now process 99% of the modules for both MPE V and 
MPE XL. 

Most of the processing is done by the filters. However,
there is a need to accommodate certain complex modules
that consist of multiple compilation units that may even
be written in different languages. To keep the process as
simple as possible, we filter each submodule and later
employ a merge utility, which we also developed. This
program uses the data base to determine which submodules
need to be merged. The source, identifier, cross reference,
and optional object files are retained but the procedure
map files are combined. Each procedure entry in the
merged map file indicates which submodule contains it.
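
A minimal sketch of the merge step follows. The record layout is an assumption made for illustration: each submodule's procedure map is taken to be a list of (procedure name, offset) pairs, and the merged map is sorted by procedure name, which the article does not specify.

```python
def merge_procedure_maps(submodule_maps):
    """Combine per-submodule procedure maps into a single map in which
    every entry records the submodule that contains the procedure.

    submodule_maps: dict of submodule name -> list of (procedure, offset).
    """
    merged = []
    for submodule, entries in submodule_maps.items():
        for procedure, offset in entries:
            merged.append({"procedure": procedure,
                           "offset": offset,
                           "submodule": submodule})
    # Sorting by name is an assumed convenience for lookups.
    merged.sort(key=lambda entry: entry["procedure"])
    return merged

# Hypothetical submodules and procedures.
maps = {"SUBA": [("OPENFILE", 0o100)], "SUBB": [("CLOSEFILE", 0o20)]}
merged_map = merge_procedure_maps(maps)
```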

Writing the filters was not a trivial task. We encountered 
numerous difficulties. The biggest challenge was posed by 
inaccuracies in the compiled output. The filters detected 
numerous cases of cross references that didn't exist or were 
on pages other than what the compiler reported. The Pascal 
compilers support long identifier names but truncate them 
in many places.

Additional challenges were provided by programmers. 
Some use NOLIST compiler directives to turn off listing
output. Others use the DEFINE construct in SPL to improve
readability and shorten the code. Still others use different 
cross-reference programs whose formats are different from 
the ones for which the filters were written. 

Premastering and Mastering

Premastering is the process of converting files from
standard DOS format to High Sierra format. The files output
by the filters are standard DOS file images, while compact
discs are recorded according to the High Sierra standard.
Premastering changes the structure, not the content, of the
files. The conversion is done on a CD Publisher system
manufactured by Meridian Data Systems. The output of
the CD Publisher is a set of master tapes, which are then
sent to a compact disk mastering facility.

The mastering vendor takes the tapes and creates a
CD-ROM master with the same data structure. This will be
used to press CD-ROMs by a process identical to that used
for audio CDs. The finished CDs are sent back to HP for
packaging and distribution.



Access Program Design Philosophy 

As mentioned above, a major goal for this project was to
make the access program easy to use. This was especially
important because most of the engineers who use it are not
knowledgeable about personal computers. Therefore, we
designed the screen layout with the major commands
permanently displayed on the second line. Above that line is
an area that identifies the current procedure. It is also used
for dialog for commands that require it. The remainder of
the screen is used for displaying source code.

Commands are invoked by pointing at them with the
mouse. For systems without a mouse, the command can
be selected by pressing the slash key (/) followed by the
first letter of the command. When a command is selected,
a menu drops down from the command line listing the
subcommands. The user can point to the desired subcommand
with the mouse or type the first letter of the subcommand.
Prompts for additional information can be displayed
on the top line or in dialog boxes if more room is needed.

Many of the commands require information such as the
name of a procedure or a variable. We recognized that,
while the program is in use, this information is probably
already displayed on the screen. Therefore, we permit the
user to move the alpha cursor by pointing at a screen
position with the mouse, then selecting the command. When
the user is prompted for the name of a procedure or variable,
the access program automatically displays the identifier
above the cursor as the default value.
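
Picking up the identifier at the cursor position is straightforward to sketch. The identifier pattern below is an assumption (a letter followed by letters, digits, underscores, or apostrophes); the article does not say how the access program recognizes identifiers, and the sample source line is invented.

```python
import re

# Assumed identifier syntax, for illustration only.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_']*")

def identifier_at(line, column):
    """Return the identifier that covers the given column of a displayed
    source line, or None if the cursor is not on an identifier."""
    for match in IDENTIFIER.finditer(line):
        if match.start() <= column < match.end():
            return match.group()
    return None

source_line = "IF CHECKLDEV(LDEV) THEN"
default_value = identifier_at(source_line, 14)   # cursor on "LDEV"
```

The value found this way would be offered as the default in the prompt, so the user only has to confirm it.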

Another design decision was the extensive use of win- 
dows — temporary boxes that overlay the main screen and 
contain information gathered from some other place in the 
listing. For example, if the user wants to know more about
a variable used in the currently displayed code, the
information is displayed in a window overlaying the top of the
code area. Once the user has finished with the window, it
is removed and the code area is restored to its previous 
condition. 

Although HP Source Reader uses many windows, it is
not a Microsoft® Windows application. At the time the
project began, MS Windows was not an established
product. There was little known about OS/2 and Presentation
Manager. Therefore, we decided to implement the access
program as a character-based DOS application capable of
running in various environments including DOS, MS
Windows, and Quarterdeck DesqView.™

However, we also decided to structure the program in
such a way that converting to MS Windows or Presentation
Manager would be feasible without a complete rewrite.
Thus, the program has a main loop, which checks for a
user action (keystroke, mouse movement, or mouse button
press). Control then passes to a routine based on the current
internal state. That routine performs some action, possibly
changes the internal state, and returns to the main loop.
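
The event-driven structure described above can be sketched as a small state machine. This is a simplified illustration, not the program's actual design: the two states, the event names, and the handler functions are all hypothetical.

```python
def browsing(event, context):
    """Handler for the hypothetical 'browsing' state."""
    if event == "/":
        return "command"            # slash key opens the command line
    if event == "down":
        context["line"] += 1        # scroll one line
    return "browsing"

def command(event, context):
    """Handler for the hypothetical 'command' state."""
    if event == "q":
        return None                 # quit: leave the main loop
    return "browsing"               # any other key returns to the code display

HANDLERS = {"browsing": browsing, "command": command}

def main_loop(events):
    """Fetch each user action and dispatch on the current internal state."""
    state = "browsing"
    context = {"line": 0}
    for event in events:
        state = HANDLERS[state](event, context)
        if state is None:
            break
    return context

result = main_loop(["down", "down", "/", "x", "down", "/", "q", "down"])
```

Keeping all input handling in one loop and all behavior in per-state routines is what makes a later move to a message-driven windowing environment plausible without a complete rewrite.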

Another important attribute of the access program is the
speed of scrolling; we wanted it to be as fast as possible.
Unfortunately, the access speed of the CD-ROM is only a
bit faster than that of a flexible disk drive. Since most of
our disk access is sequential, we implemented a buffering
algorithm using buffers that are one sector long (2048
bytes). A pool of buffers is allocated when the program
initiates. The exact number depends on the amount of
memory available on the system. Buffers are linked in order
from most recently used to least recently used. When a
new one is needed, the least recently used buffer is cleared
and reused. This results in faster access than simply reading
the individual records one at a time. Furthermore, the
display of information already in the buffers is very rapid,
since no I/O is required.
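
A least-recently-used buffer pool of this kind can be sketched as follows. This is an illustration of the technique, not the Turbo Pascal implementation: it uses an ordered dictionary instead of a linked list, reads from any seekable byte stream in place of the CD-ROM, and the class and method names are invented.

```python
import io
from collections import OrderedDict

SECTOR_SIZE = 2048  # bytes per buffer, one sector as stated in the text

class SectorCache:
    """Fixed-size pool of sector buffers with least-recently-used reuse."""

    def __init__(self, stream, pool_size):
        self._stream = stream
        self._pool_size = pool_size
        # Ordered from least recently used (first) to most recently used (last).
        self._buffers = OrderedDict()
        self.reads = 0  # number of real I/O operations, for illustration

    def sector(self, number):
        if number in self._buffers:
            self._buffers.move_to_end(number)     # mark as most recently used
            return self._buffers[number]
        if len(self._buffers) >= self._pool_size:
            self._buffers.popitem(last=False)     # reuse the LRU buffer
        self._stream.seek(number * SECTOR_SIZE)
        data = self._stream.read(SECTOR_SIZE)
        self.reads += 1
        self._buffers[number] = data
        return data

# Three fake sectors filled with the bytes 0, 1, and 2.
disc = io.BytesIO(b"".join(bytes([n]) * SECTOR_SIZE for n in range(3)))
cache = SectorCache(disc, pool_size=2)
first = cache.sector(0)
again = cache.sector(0)   # served from the pool, no second read
```

Scrolling back and forth within text that is already buffered costs no I/O at all, which matches the behavior the authors describe.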

HP Source Reader is written in Turbo Pascal from
Borland International with extensive use of routines in Turbo
Power Tools Plus from Blaise Computing. The program
employs numerous overlays which are carefully organized
to preclude the possibility of thrashing.

Access Program Command Overview 

The HP Source Reader access program is designed to
provide engineers with the most flexible interface possible,
one that provides commands that allow the required
code to be located with minimum delay. The program was
developed by engineers who would use it in day-to-day
work, so the command structure chosen complements the
data provided by current tools.

The main commands and subcommands of HP Source 
Reader are as follows: 

GOTO 

This is probably the most important command in the access
program. It allows the user to select the exact code to
display. Subcommands allow different types of access to the
source. In MPE, each module is located either in a library
or an application program. In MPE V/E and MPE XL
compatibility mode, procedures are grouped into segments. In
MPE XL native mode, segmentation is not used. To provide
a consistent user interface, HP Source Reader defines "native
mode segment" to be interchangeable with "module."
GOTO has six subcommands.

SEGMENT. Allows the user to select a segment/module
name to be used for the starting point for displaying source
code. Optionally, the user can also provide an offset from
that starting point. The user can limit the search domain
to specific libraries to reduce search time.
PROCEDURE. Identical to GOTO SEGMENT except that the
user provides a procedure name as the starting point.
ENTRY. Equivalent to GOTO PROCEDURE with an implicit
offset to the main entry point of the procedure. This
bypasses declarations and nested subroutines, procedures,
and functions.
CALL. Equivalent to GOTO ENTRY, plus the current module
and location are saved in a logfile, allowing the user to
return to this point at a later time. This mimics the call
and return mechanism used by a computer.
RETURN. Allows the user to return to a place in the source
code that was saved in the logfile as a result of an earlier
GOTO CALL.
APPLICATION. Allows the user to select an application
program to be displayed instead of a library module.
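
The GOTO CALL and GOTO RETURN pair behaves like a stack of saved locations. The sketch below illustrates that behavior only: it keeps the saved locations in memory rather than in a logfile, and the class name, method names, and the (module, offset) representation of a location are assumptions.

```python
class LocationLog:
    """Saved browsing locations, most recent last (an in-memory stand-in
    for the logfile described in the text)."""

    def __init__(self):
        self._saved = []

    def call(self, current, target):
        """Remember where we are, then move to the target location."""
        self._saved.append(current)
        return target

    def ret(self):
        """Go back to the location saved by the most recent call."""
        if not self._saved:
            raise LookupError("no saved location to return to")
        return self._saved.pop()

    def history(self):
        """Saved locations, most recent first."""
        return list(reversed(self._saved))

log = LocationLog()
here = ("MODULEA", 0o1234)                     # hypothetical module and offset
now = log.call(here, ("MODULEB", 0))           # follow a call into another module
back = log.ret()                               # and come back again
```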



IDENTIFY

This command displays information regarding identifiers
defined in the source code. Three subcommands select
different information to display.
VALUE. The identifier map information supplied by the
compiler for the selected identifier is displayed in a window.
This includes type, class, and location or value.
DEFINITION. The source code containing the definition of
an identifier is displayed in a scrollable window.
LOCAL VARS. The identifier map information supplied by
the compiler for all the local identifiers in the current
procedure is displayed in a scrollable window.

SEARCH

This command finds a specific item or pattern in the current
module. Three subcommands determine the search method.
Each can be done in a forward or backward direction.
IDENTIFIERS. Finds the next or previous occurrence of an
identifier as supplied by the compiler cross-reference table.
TEXT. Searches forward or backward for text matching a
pattern, which can include wildcard characters for increased
flexibility.
LEVEL. Searches in the required direction for a specific
block level. The block level is a function of the BEGIN-END
statements in Pascal and SPL. Each BEGIN increments the
level number, and each END decrements it.

DISPLAY

This command switches the display between code and
supplementary information while retaining the previously
displayed information. Seven subcommands select what
information to display.
CODE. Returns to the source code display.
PMAP. Displays the procedure map for the current module.
This lists procedures with segment offsets, if applicable.
REFERENCE. Displays the current page of the current
reference document. Useful documents such as internal
specifications are included on the CD-ROM.
LIBRARY/MODULE/APPLICATION. Displays a list of procedures,
module segments, or applications whose names match a
pattern.
STACK. Displays the current logfile CALL history.

TOGGLE

This command controls the state of three binary switches.
ABSOLUTE/RELATIVE. Alters the way that code offsets are
displayed. They can be absolute (using the segment as a
base) or relative (using the procedure as a base).
HEX/OCTAL. Alters the radix of code offsets.
SOURCE ONLY/INNERLIST. Displays source code only or
source code interspersed with the corresponding assembly
instructions generated for each source line.

PRINT

This command prints information to a printer or file. There
are subcommands to control what is printed.

REFERENCE

This command selects a specific document or a location
in that document using the table of contents or index.



CONFIGURE 

This command is used to customize the program by selecting
miscellaneous options for the access program to use.
These include display colors, screen size, printer, function
keys, and CD-ROM drive location.

Fig. 5. Screens from a typical HP Source Reader session. (a) The GOTO SEGMENT hardres
command is entered. (b) Resulting screen. (c) Invoking the IDENTIFY VALUE command.
(d) Resulting screen.



HELP 

Context sensitive help text is provided to assist with any
difficulty using the program. For example, if the user is
being prompted for some input, the HELP command dis- 
plays text that explains the exact nature of the input re- 
quired. This is most useful for a novice user, but even 
experienced users may need assistance from time to time 
with infrequently used features. 

QUIT 

This command gracefully exits from the program. A special
logfile entry is made, saving the current location. This
allows the user to issue a GOTO RETURN command the next
time the program is run to resume displaying the source
code that was being displayed when HP Source Reader
was last terminated.

An Example 

Fig. 5 shows part of a typical HP Source Reader session.
An engineer is trying to locate the source line that aborted
the system. From the memory dump the engineer has determined
that the code aborted in segment HARDRES at octal
offset 16562. The engineer switches from the dump analysis
tool to HP Source Reader. Fig. 5a shows the screen after
the engineer selects the GOTO SEGMENT command and types
the segment name and offset. HP Source Reader locates the
source code at that location, resulting in the screen shown
in Fig. 5b. The cursor is positioned on the source line
corresponding to the return point from the call to
SUDDENDEATH; the engineer has found the call that aborted the
system.

From the code, it is apparent to the engineer that
SUDDENDEATH is called if CHECKLDEV determines that the value of
the variable LDEV is invalid. The engineer then needs to
examine LDEV in the dump to determine what value it
contained when the check failed. The engineer uses the
mouse to point to LDEV on the screen, then invokes the
IDENTIFY VALUE command. Fig. 5c shows the screen for
doing this. HP Source Reader locates the identifier map
information for LDEV and displays it in a window as shown
in Fig. 5d. The engineer now knows that LDEV is found at
location Q-%14, and therefore switches back to the dump
analysis tool and examines the value of LDEV found at that
location in the memory dump.
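
The first step of this session, turning a segment offset from the dump into a procedure and a displayed offset, can be sketched as a lookup in a procedure map. Everything except the octal offset 16562 is hypothetical: the map contents and procedure names are invented, the `%` prefix for octal follows the notation used in the example above, and the `$` prefix for hexadecimal is an assumption.

```python
import bisect

def find_procedure(pmap, segment_offset):
    """Return the (start_offset, name) entry of the procedure containing
    the offset. pmap must be sorted by start offset."""
    starts = [start for start, _ in pmap]
    index = bisect.bisect_right(starts, segment_offset) - 1
    if index < 0:
        raise ValueError("offset precedes the first procedure")
    return pmap[index]

def format_offset(segment_offset, procedure_start, relative=False, hexadecimal=False):
    """Render an offset the way the ABSOLUTE/RELATIVE and HEX/OCTAL
    switches might: relative to the procedure or to the segment."""
    value = segment_offset - procedure_start if relative else segment_offset
    return "$" + format(value, "X") if hexadecimal else "%" + format(value, "o")

# Hypothetical procedure map for one segment.
pmap = [(0o0, "PROC_A"), (0o16000, "PROC_B"), (0o17000, "PROC_C")]
start, name = find_procedure(pmap, 0o16562)
absolute_octal = format_offset(0o16562, start)
relative_octal = format_offset(0o16562, start, relative=True)
absolute_hex = format_offset(0o16562, start, hexadecimal=True)
```

With this invented map, the absolute octal display is `%16562` and the relative display is `%562`, since the enclosing procedure is assumed to start at `%16000`.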

Conclusions 

HP Source Reader provides substantial increases in
productivity based on our personal experience, feedback from
support engineers, and management analysis. The time that
it takes an engineer to locate a specific source location has
been reduced from several minutes to a few seconds.
Further savings are achieved by direct access to supporting
information such as identifier maps, assembly code,
reference materials, and other source code. Cost savings are
also achieved by the elimination of paper listings. These
savings include computer time, consumable items, labor
for printing and binding, and storage costs.

HP Source Reader represents an important contribution
to HP's commitment to customer satisfaction in support.
Local support engineers now have fast access to complete
source listings. Previously, maintaining such listings in
every HP support office was not cost-effective. Today, more
problems are resolved by field support personnel.
Customers realize this as improved system availability.

HP SdUlCe Reader is now in use in virtually every HP 
support office around the world. Engineers tell us it is 

9d trademark o* Microsoft GwpafHfjan 



indispensable, and managers at all levels have gone out of 
their way to report that HP Source Reader has dramatically 
improved problem resolution time, 

HP Source Reader successfully combines new optical 
media technology with the ease of use and power of the 
PC. Designed with HP's traditional "next bench" develop- 
ment philosophy, it seems to be developing in : rhod 
for \LPE system support engineers who analyze 
ory dumps. 
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Correction 



In the left column on page 99 of the October 1989 issue, the words
"parallel" and "perpendicular" are transposed in equations 3 and 4,
Fig. 2, and the associated text. Fig. 2a on page 99 shows
reflectivity R(θ), not reflection coefficient r(θ) as stated.
Fig. 2b shows R²(θ), which is the fraction of light reflected after
two reflections. Also, Brewster's angle θ_B is approximately 61°
instead of 59° as shown.



DECEMBER 1989 HEWLETT-PACKARD JOURNAL 57






Transmission Line Effects in Testing 
High-Speed Devices with a High- 
Performance Test System 

The testing of high-speed, high-pin-count ICs that are not 
designed to drive transmission lines can be a problem, 
since the tester-to-device interconnection almost always 
acts like a transmission line. The HP 82000 IC Evaluation 
System uses a resistive divider technique to test CMOS and 
other high-speed devices accurately. 

by Rainer Plitschka



TODAY'S STATE-OF-THE-ART DIGITAL ASICs (application-specific
integrated circuits) are characterized by faster and faster clock
rates and signal transition times. In testing these devices,
delivering the test signals to the device under test (DUT) and
precisely measuring the response of the DUT can be a problem. To
maintain signal fidelity, transmission line techniques have to be
applied to the tester-to-DUT interconnection.

This paper illustrates how this critical signal path is implemented
in the HP 82000 IC Evaluation System to obtain high-precision
timing and level measurements even for difficult-to-test CMOS
devices. The HP 82000 offers a resistive divider arrangement that
provides terminated transmission lines to the inputs and outputs of
the DUT. This makes it possible to test low-output-current devices
up to their maximum operating frequencies. The HP 82000 tester also
offers good threshold accuracy, low minimum detectable signal
amplitude, and system software that supports adjustment of the
compare thresholds according to the selected divide ratio.

Whether an interconnection between the tester pin electronics and
the DUT should be considered a transmission line depends on the
interconnection length and the transition time of the driving
circuitry. If

    t_pd > t_r / 8                                            (1)

where t_r is the equivalent transition time (0 to 100%) and t_pd is
the propagation delay (electrical length) of the interconnection,
then the interconnection has to be treated as a transmission line.¹
For delays less than 1/8 of the transition time, the
interconnection can be considered a lumped element.

Table I shows propagation velocities of signals in different types
of transmission lines. Using equation 1 for a typical ECL output or
a modern CMOS output with a 20-to-80% transition time of 1 ns, or
1.67 ns for 0 to 100%, and using Table I for signal velocities, we
can compute a maximum interconnection length of 1.25 inch (3.1 cm)
for a microstrip line. There are no high-pin-count testers that
even come close to such a short interconnection length between the
pin electronics and the DUT. Therefore, a transmission line model
must be used.

Table I
Signal Velocity in Different Transmission Line Media

    Type                  Velocity
    Coax, air             1 ft (30 cm) per ns
    Coax, foam-filled     8 in (20 cm) per ns
    Microstrip, FR4       6 in (15 cm) per ns
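
The check in equation 1 and the 1.25-inch result can be reproduced
with a short Python sketch. This is an illustration only, not part
of the HP 82000 software: the function names are hypothetical, the
velocities are taken from Table I, and the conversion from a
20-to-80% to a 0-to-100% transition time assumes a linear edge
(division by 0.6).

```python
# Signal velocities from Table I, in inches per nanosecond.
VELOCITY_IN_PER_NS = {
    "coax_air": 12.0,
    "coax_foam": 8.0,
    "microstrip_fr4": 6.0,
}


def full_transition_time_ns(t_20_80_ns):
    """Convert a 20-to-80% transition time to 0-to-100% (linear edge assumed)."""
    return t_20_80_ns / 0.6


def max_lumped_length_in(t_r_ns, medium):
    """Longest interconnection (inches) whose delay stays at t_r / 8."""
    return VELOCITY_IN_PER_NS[medium] * t_r_ns / 8.0


def is_transmission_line(length_in, t_r_ns, medium):
    """Equation 1: true when t_pd exceeds one eighth of the transition time."""
    t_pd_ns = length_in / VELOCITY_IN_PER_NS[medium]
    return t_pd_ns > t_r_ns / 8.0


t_r = full_transition_time_ns(1.0)  # about 1.67 ns
print(round(max_lumped_length_in(t_r, "microstrip_fr4"), 2))  # 1.25
print(is_transmission_line(12.0, t_r, "microstrip_fr4"))      # True
```

A 12-inch microstrip path, which is an assumed but plausible tester
fixture length, is far beyond the lumped-element limit for this
edge speed.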

Transmission Line Impedance 

Besides signal velocity, the line impedance Z_l is a characteristic
parameter of a transmission line. The value of Z_l depends on the
line type, geometric factors, and the electrical parameters of the
materials used. Table II shows typical values and tolerances. Note
that Z_l typically lies within a small range of values, and that
quite high tolerances are usual.

Table II
Transmission Line Impedance Characteristics

    Line Type             Range of Z_l      Tolerance
    Coax, foam-filled     50 to 100Ω        2 to 10%
    Microstrip, FR4       30 to 120Ω        5 to 20%


The choice of a value for Z_l in a high-speed tester environment is
influenced by three major factors. First, the outputs of ECL
devices normally are designed to operate at Z_l = 50Ω. However, 25Ω
and 100Ω outputs exist.

Second, connecting a capacitance C to the end of a transmission
line forms a low-pass filter. This occurs in a tester when a DUT
with input capacitance C_in is connected to a driver. It also
occurs at a comparator input, which has a lumped capacitance
C_lumped (see Fig. 1). The low-pass filter's step response
transition time t_s (10% to 90%) is:

    t_s = 2.2 τ = 2.2 Z_l C                                   (2)

A signal with transition time t_r at the input to the filter will
be slowed down to a transition time of t_r' at the output:

    t_r' = sqrt(t_r^2 + t_s^2)                                (3)

which adds additional delay at every point of the original
transition. For the 50% point, this delay is approximated by the
factors shown in Table III.

Table III
Delay for the 50% Point of a Transition Caused by
Low-Pass Filtering

                     t_r << t_s     t_r = t_s     t_r >> t_s
    Delay at 50%     0.7 Z_l C      0.9 Z_l C     1.0 Z_l C

As a consequence, the impedance Z_l should be as low as possible,
because C (that is, C_in or C_lumped) is always nonzero.
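
The low-pass filtering effect can be estimated numerically. The
following Python sketch assumes the step response time is
2.2 Z_l C and that transition times combine by the usual
root-sum-square rule; the 5-pF load and 1-ns input edge are
hypothetical example values, and the delay factors are the
approximate ones quoted for the 50% point.

```python
import math


def step_response_time(z_line_ohm, c_farad):
    """10%-to-90% step response time of the Z_l-C low-pass filter."""
    return 2.2 * z_line_ohm * c_farad


def output_transition_time(t_r, t_s):
    """Transition time after the filter, using root-sum-square combination."""
    return math.sqrt(t_r ** 2 + t_s ** 2)


def delay_at_50_percent(z_line_ohm, c_farad, factor):
    """Delay of the 50% point; factor is 0.7, 0.9, or 1.0 depending on t_r/t_s."""
    return factor * z_line_ohm * c_farad


Z_L = 50.0      # ohms
C_IN = 5e-12    # farads, assumed DUT input capacitance
T_R = 1e-9      # seconds, assumed input transition time

t_s = step_response_time(Z_L, C_IN)            # 550 ps
t_out = output_transition_time(T_R, t_s)       # about 1.14 ns
delay = delay_at_50_percent(Z_L, C_IN, 1.0)    # 250 ps for t_r >> t_s
print(t_s, t_out, delay)
```

Halving Z_l halves both t_s and the 50% delay, which is the reason
a low line impedance is preferred from the filtering point of view.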

The third factor influencing the value of Z_l is the required
source current capability. To minimize t_s, Z_l should be as low as
possible. To generate a voltage step V_s to propagate along the
transmission line, the source has to provide current I_s according
to Ohm's law:

    I_s = V_s / Z_l                                           (4)

This is true for both the tester's driver circuit and the DUT.
Proper design of the driver circuit will ensure sufficient drive
current. However, some DUT outputs, especially CMOS, cannot provide
the current required over the entire range of Z_l values shown in
Table II.

As a result of these considerations, a tester in which both
accuracy and speed are important will use an impedance Z_l of 50Ω.

Termination Models 

To maintain pulse performance, a terminated signal distribution
system has to be used. Two methods of performing the termination
are possible: parallel and series.

Parallel termination uses a resistor R_t = Z_l at the end of the
transmission line, as shown in Fig. 2. At time t = 0, a voltage
step V_s is generated by the source. The forward wave will see the
line as a resistor R = Z_l, and therefore V_1(t = 0) = V_s. At
t = t_pd the wave has reached the end of the transmission line, and
because R_t = Z_l, V_2(t = t_pd) = V_s. No further reflections
occur. The current that must be provided by the source is
I = V_s/Z_l, and it flows as long as V_1(t) = V_s. This model is
applicable for ECL outputs. The resistor R_t is connected to -2V.

The series termination method uses a resistor R_s = Z_l in series
between the source and the transmission line, as shown in Fig. 3.
At time t = 0, a voltage step V_s is generated by the source. The
forward wave will see the line as a resistor R = Z_l. Because of
voltage splitting between R_s and Z_l, V_2(t = 0) = V_s/2. At
t = t_pd the wave has reached the end of the transmission line, and
because of reflection at the open end, V_3(t = t_pd) = V_s. After
t = 2t_pd the reflected wave will reach the source side, giving
V_2(t = 2t_pd) = V_s. No further reflections occur, since the
source side is terminated. The current I to be provided by the
source is I = V_s/2Z_l for the time 2t_pd. This termination model
is appropriate for a driver circuit in the tester.

A version of this paper was presented at the IEEE European Test
Conference, Paris, 1989.

Fig. 1. DUT interconnection model showing low-pass filters caused
by capacitive loadings.

Unterminated Environment 

When connecting a source with R_out ≠ Z_l to a transmission line,
there is no matching element in the circuitry. This situation
arises when a DUT output, such as CMOS or TTL, is connected to a
tester channel in which the driver has been set to high impedance
and a high-impedance comparator is used. Fig. 4 shows the resulting
waveforms for R_out > Z_l and R_out < Z_l. As can be seen, the
transmission line mismatch creates a series of pulses that reflect
back and forth (ringing). The amplitude and number of steps depend
on the magnitude of the mismatch, and the duration depends on the
propagation delay of the line and the number of steps.

Fig. 2. Parallel termination model.

Under these conditions, accurate timing and level measurement are
not easy.² For repeatability of measurement results, the ringing
should be completely settled before a measurement is made.
Therefore, the device has to be tested at data rates far lower than
maximum. Fig. 5 shows the relationship between the maximum possible
test frequency and the electrical length of the interconnection for
various degrees of mismatch (i.e., different device impedances),
assuming two different settling criteria. One of the two curves
assumes that the waveform is allowed to settle within 10% of its
final value before an opposite transition can be started. The other
assumes 1%.
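
The ringing described above can be illustrated with a simple
bounce-diagram calculation. The Python sketch below is not the
method used to generate Fig. 5; it assumes an ideal lossless line,
a linear source resistance R_out, a perfectly open far end
(high-impedance comparator with no capacitance), and zero
transition time. All function names are hypothetical.

```python
def source_reflection(r_out, z_line):
    """Reflection coefficient seen by a wave returning to the source."""
    return (r_out - z_line) / (r_out + z_line)


def far_end_voltages(v_step, r_out, z_line, arrivals):
    """Voltage at the open far end after each wave arrival (t_pd, 3 t_pd, ...)."""
    gamma = source_reflection(r_out, z_line)
    wave = v_step * z_line / (r_out + z_line)  # initial forward wave
    voltage = 0.0
    levels = []
    for _ in range(arrivals):
        voltage += 2.0 * wave   # the open end doubles the incident wave
        levels.append(voltage)
        wave *= gamma           # re-reflection at the mismatched source
    return levels


def arrivals_to_settle(r_out, z_line, tolerance):
    """Number of far-end arrivals until the error is within the tolerance."""
    gamma = abs(source_reflection(r_out, z_line))
    if gamma >= 1.0:
        raise ValueError("source resistance must be positive and finite")
    count, error = 1, gamma
    while error > tolerance:
        error *= gamma
        count += 1
    return count


def settling_time(r_out, z_line, t_pd, tolerance):
    """Time until the far end has settled; arrivals occur at odd multiples of t_pd."""
    return (2 * arrivals_to_settle(r_out, z_line, tolerance) - 1) * t_pd


print(far_end_voltages(1.0, 150.0, 50.0, 3))   # [0.5, 0.75, 0.875]
print(settling_time(150.0, 50.0, 3e-9, 0.10))  # seven line delays, about 21 ns
```

With R_out = 150Ω on a 50Ω line, settling to 10% takes seven line
delays and settling to 1% takes thirteen, which shows why the
usable test frequency drops quickly with interconnection length.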

Tester Parasitics 

The basic elements of a tester's pin electronics are a transmission
line, a driver, and a comparator. There is normally also an ac/dc
switch for performing dc measurements. This switch, implemented
using a relay, can cause problems. However, by proper selection of
the relay type and careful design, the transmission line impedance
can be maintained without significant parasitics.

For stimulating the DUT, the driver output signal is fed to the
pin. Because of the input capacitance of the fixturing and the pin
capacitance (C_in), the driver transitions will be slowed. This
causes a delay as discussed above. Equations 2 and 3 and Table III
can be used to determine the delay. Also, because of input leakage
currents flowing through the driver's source impedance
(R = Z_l = 50Ω), the driver levels will change. For example, for an
ECL device with I_ih = 500 µA typically, there will be a voltage
drop of I_ih Z_l = 25 mV.* Further problems will not occur.

For receiving DUT data, the comparator can be used in two different
modes (Fig. 6): high-impedance (high-Z) and terminated (parallel).

In the high-Z mode, the driver is switched to high impedance,
resulting in a capacitance C_lumped formed by the parasitics of the
amplifier's switched-off transistors. Assuming a value of 3 pF for
the compare chip (C_c) and 20 pF for the driver (C_d),
C_lumped = C_c + C_d = 23 pF and the resulting step response time
for the comparator input voltage is 2.5 ns.

In the terminated mode, the tester's driver is used for
termination, eliminating the capacitance C_d. Only the comparator's
input capacitance will limit the bandwidth, giving a step response
time of 2.2(C_c Z_l)/2 = 165 ps. This value is equivalent to an
analog input bandwidth of 2 GHz.
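
The two step response times quoted above follow directly from the
2.2RC rule. The Python sketch below reproduces them; the conversion
to bandwidth uses the common single-pole rule of thumb
BW ≈ 0.35/t_r, which is an assumption added here and is not stated
in the text.

```python
def step_response_time(r_ohm, c_farad):
    """10%-to-90% step response time of an RC low-pass filter."""
    return 2.2 * r_ohm * c_farad


def bandwidth_hz(t_rise_s):
    """Approximate 3-dB bandwidth of a single-pole response (rule of thumb)."""
    return 0.35 / t_rise_s


Z_L = 50.0     # ohms, line impedance
C_C = 3e-12    # farads, compare chip capacitance
C_D = 20e-12   # farads, switched-off driver capacitance

# High-Z mode: the line drives both capacitances.
t_high_z = step_response_time(Z_L, C_C + C_D)

# Terminated mode: the comparator capacitance sees the line in parallel
# with the termination, that is, Z_l / 2.
t_terminated = step_response_time(Z_L / 2.0, C_C)

print(t_high_z)                    # about 2.5 ns
print(t_terminated)                # about 165 ps
print(bandwidth_hz(t_terminated))  # about 2.1 GHz
```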

Fig. 7 shows the step response as a shmoo plot. The stimulus was a
pulse with a transition time of t_r = 200 ps from an HP 8131A Pulse
Generator. The measured value of the 10-to-90% transition is
t_m = 275 ps. The resulting intrinsic transition time t_i of the
comparator is therefore:

    t_i = sqrt(t_m^2 - t_r^2) = 165 ps.

Fig. 3. Series termination model.

*In this paper the subscripts o and i indicate output and input
parameters, respectively, and the subscripts h and l indicate high
and low logic levels, respectively. Subscripts s, d, and g indicate
the source, drain, and gate, respectively, of a field-effect
transistor.

Fig. 4. Unterminated model.

Interfacing CMOS Devices

CMOS devices are usually unable to drive transmission lines. The
output impedance of CMOS devices does not match typical
transmission line impedances, and static power dissipation, which
occurs when driving a terminated transmission line, may damage the
CMOS device.

Fig. 8 shows the operating characteristics of a CMOS output buffer
cell.³ The specified dc parameters V_ohmin at I_oh and V_olmax at
I_ol are marked.

The output resistance is not linear. For high V_ds, the cell acts
as a current source. For low V_ds, it is a voltage source with a
low resistance. The large-signal output resistance is defined for
either high or low output by:

    R_out = V_ds / I_d                                        (5)

where V_ds and I_d are corresponding values on the curves.

The worst-case output resistance, R_maxl or R_maxh, is defined when
V_ds = V_ohmin or V_olmax and I_d = I_oh or I_ol. Note that there
are major differences between typical and worst-case resistances.
The resistance also varies with the operating temperature.

A CMOS output connected to a capacitance C will perform as shown in
Fig. 9. Assume that the source FET is turned on at t = 0 with
V_ds = V_dd. The capacitor is charged with constant current,
resulting in a linear ramping voltage. As the capacitor voltage
increases, V_ds and the output resistance decrease. This decreases
current flow into the capacitor, which slows the voltage ramp. The
resulting capacitor voltage waveform resembles an exponential
curve.

The performance of a CMOS output driving a resistive load R_load
connected to a voltage source V_load is shown in Fig. 10. The
output voltage can be obtained by drawing a load line defined by
V_load and R_load. The intersections with the FET characteristics
define the output voltage and current for source and sink
operation. The transition times depend only on the internal
switching. Loading to the dc specifications can be obtained by
using values for R_load and V_load calculated as follows:

    R_load = (V_ohmin - V_olmax) / (|I_oh| + |I_ol|)
    V_load = (V_ohmin |I_ol| + V_olmax |I_oh|) / (|I_oh| + |I_ol|)   (6)

A worst-case device loaded to I_oh or I_ol will have an output
voltage of V_ohmin or V_olmax, respectively. A typical device will
have an output voltage greater than V_ohmin or less than V_olmax,
respectively.



CMOS Driving a Transmission Line 

Connection of a CMOS output directly to an open-ended transmission
line is shown in Fig. 11. Assume that R_out > Z_l and a positive
transition occurs at t = 0. At t = t_pd the device output will
correspond to the intersection of the FET's characteristic and the
load line defined by V_ds = V_dd and I_d = V_dd/Z_l. Calculating
the device's output resistance using equation 5, the output
waveform behavior can be predicted as discussed above for the
unterminated environment. Because of the nonlinear output
resistance, slightly different waveforms may occur depending on the
actual V_ds and I_d. When R_out is less than Z_l, the second step
on the source side may be higher than V_dd. If clamping diodes are
included between the output and V_dd, this reflection can be
reduced and further reflections will be inverted.

For CMOS outputs with R_out < Z_l, termination can be achieved by
adding a resistor R_s between the output and the transmission line.
The value should be:

    R_s = Z_l - R_out

This is the series termination model, which gives correct pulse
performance. This method has been suggested in the past.⁴ However,
its practical applicability is limited, because the output
resistance for positive and negative transitions is generally not
equal, and the output resistance changes from sample to sample and
is not stable with temperature.

Fig. 5. Maximum test frequency in an unterminated environment. Any
DUT switching time is assumed to be zero and any transition time is
assumed to be zero.

Fig. 6. Operating modes of the receiver path.

The Resistive Divider Solution 

The resistive divider provides a solution to the problem of
embedding a CMOS device in a transmission line environment. This
technique is implemented in the HP 82000 IC Evaluation System.

The operating principle of the resistive divider is to apply a
definable dc load to the DUT. Signal fidelity is maintained because
the signal is fed into a parallel-terminated system; therefore, no
reflections occur.

Fig. 12 shows a schematic diagram of the resistive divider. The
resistor R_t is built into the tester. The resistor R_s is selected
by the user to give an appropriate divide ratio for the particular
DUT. R_s is installed on the DUT board, which interfaces the DUT to
the tester and is different for each DUT. The user then tells the
HP 82000 software what the divide ratio is. The termination voltage
V_t in Fig. 12 is also selected by the user.

Besides providing a terminated transmission line environment, the
resistive divider puts only a very small capacitive load on the DUT
(shown as C_par in Fig. 12). A value as low as 2 pF can be obtained
if R_s is close to the DUT pin. This is possible using ceramic
blade probes with printed resistors. For high-pin-count devices (up
to 512 pins), the tester's DUT board can be laid out with easily
installable resistors, keeping parasitics below 10 pF.

The length of transmission line between the DUT and the comparator
does not affect the capacitive and resistive loading on the DUT.
The termination is done by the tester's driver, which is part of
the I/O channel. Therefore, the lumped capacitance that occurs if
the driver is switched to high impedance is eliminated. This
ensures a wide bandwidth for the compare path as discussed earlier.

Fig. 7. Shmoo plot of the step response at the tester input for an
input signal with t_r = 200 ps. (Vertical axis: -2800 to 2800 mV;
horizontal axis: time, 46 to 50 ns.)

The DUT output levels detected will be reduced by the divide ratio.
The resulting comparator input voltages can be calculated by:

    V_cmp = V_t + (V_o - V_t) R_t / (R_s + R_t)               (7)

where V_o is the actual high or low output voltage under the
defined load. This equation can also be used for calculating the
appropriate threshold setting. For ease of use, this calculation is
embedded in the HP 82000 tester software, so that a user always
thinks in terms of noncompressed signals.
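
The level compression of the resistive divider, and the reverse
calculation a tester would apply to thresholds, can be sketched in
Python. This is only an illustration of the voltage divider formed
by R_s and R_t returned to V_t; it is not the HP 82000 software, the
function names are hypothetical, and the example values (R_s = 200Ω,
V_t = 2.0V, a 4.5-V output level) are assumed.

```python
def comparator_voltage(v_out, v_term, r_series, r_term=50.0):
    """Voltage seen at the comparator for a DUT output level v_out."""
    return v_term + (v_out - v_term) * r_term / (r_series + r_term)


def dut_voltage(v_cmp, v_term, r_series, r_term=50.0):
    """Back-calculate the DUT output level from the comparator voltage."""
    ratio = (r_series + r_term) / r_term   # divide ratio r
    return v_term + ratio * (v_cmp - v_term)


R_S = 200.0   # ohms, assumed series resistor (divide ratio of 5)
V_T = 2.0     # volts, assumed termination voltage

v_high = comparator_voltage(4.5, V_T, R_S)     # 2.5 V at the comparator
v_threshold = comparator_voltage(2.5, V_T, R_S)  # compressed threshold, 2.1 V
print(v_high, v_threshold)
print(dut_voltage(v_high, V_T, R_S))           # back to 4.5 V
```

A user-level threshold of 2.5V therefore has to be programmed as
2.1V at the comparator in this example, which is the kind of
conversion the text says is hidden from the user.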

Resistive Divider Parameters 

The selectable parameters of the divider are R_s and V_t. There are
several choices for defining the DUT load. Device loading according
to dc specifications is normally the best choice. The DUT's maximum
power consumption will never be exceeded and throughput is
improved. If the ac test is performed with the specified dc
loading, the need for further dc fanout measurements is eliminated.

Fig. 8. Output characteristics (I_d vs V_ds) of a CMOS output
buffer, where I_d is the drain current and V_ds is the
drain-to-source voltage across the FET.

Dc loading specifications can be converted to resistive divider
parameters using:

    R_s = (V_oh - V_ol) / (|I_oh| + |I_ol|) - R_t
    V_t = (V_oh |I_ol| + V_ol |I_oh|) / (|I_oh| + |I_ol|)     (8)
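
The conversion from dc specifications to divider parameters can be
checked with a few lines of Python. The sketch below assumes that
the total load R_s + R_t returned to V_t should draw exactly |I_oh|
at V_oh and |I_ol| at V_ol; the function name is hypothetical, and
the example values (3.84V at 4 mA, 0.33V at 4 mA) are assumed
HCMOS-style figures, not data from this article.

```python
def divider_from_dc_spec(v_oh, i_oh, v_ol, i_ol, r_term=50.0):
    """Return (R_s, V_t) that loads the output to its dc specification."""
    i_sum = abs(i_oh) + abs(i_ol)
    r_load = (v_oh - v_ol) / i_sum           # total load, R_s + R_t
    r_series = r_load - r_term
    v_term = (v_oh * abs(i_ol) + v_ol * abs(i_oh)) / i_sum
    return r_series, v_term


r_s, v_t = divider_from_dc_spec(3.84, 4e-3, 0.33, 4e-3)
print(r_s, v_t)   # about 388.75 ohms and 2.085 V

# Cross-check: the load currents at the specified levels.
i_high = (3.84 - v_t) / (r_s + 50.0)   # current sourced at V_oh
i_low = (v_t - 0.33) / (r_s + 50.0)    # current sunk at V_ol
print(i_high, i_low)                   # both about 4 mA
```

If the computed R_s comes out negative, the specified currents are
larger than a 50Ω termination alone would draw, and the dc
specification cannot be reproduced with this arrangement.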



Special ac loads are defined for timing measurements as shown in
Fig. 13. These ac loads can be converted to resistive divider
parameters by the Thevenin equations:

    R_s = 1 / (1/R_1 + 1/R_2) - R_t
    V_t = V_dd R_2 / (R_1 + R_2)                              (9)



There are situations where the values of R_s and V_t calculated
using equation 8 cannot be used. Changing the loading will change
the output levels. The changed values can be obtained from the
output characteristic curves. The actual values are defined by the
intersection of the load line with the FET curve. The worst-case
values can be obtained from the intersection of the load line and
the worst-case output resistance line, as shown in Fig. 14. These
modified levels can be calculated using:

    V_ol' = (V_t R_maxl + V_opl R_load) / (R_maxl + R_load)
    I_ol' = (V_t - V_opl) / (R_maxl + R_load)                 (10)
    I_oh' = (V_t - V_oph) / (R_maxh + R_load)

where V_opl and V_oph are the low and high level open-circuit
output voltages, and the values for worst-case output resistance
are given by equation 5.

Modified loading may result in higher power dissipation for one of
the output levels. It is recommended that power consumption be
checked using:

    (11)

Practical tests have shown no problems as long as the level change
caused is less than 500 mV.

Fig. 9. CMOS output driving a capacitive load.

To measure the reduced output signals resulting from the resistive
divider, the comparator must be designed to detect small
amplitudes. Two parameters affect this ability: comparator
hysteresis and open-loop gain. The hysteresis is a positive
feedback effect to ensure the comparator's stability. The open-loop
gain is the comparator's amplifying factor for small signals, and
is frequency dependent. Both parameters affect the finite voltage
swing (overdrive) around the threshold that has to be applied to
the comparator input to obtain output switching (see Fig. 15).

A high-performance comparator design will ensure that the necessary
overdrive will be constant up to the maximum data rate. Smaller
pulses can be detected as long as sufficient overdrive is applied.
For detection of a single transition, the value for the flat
section of Fig. 15 applies, and the limiting element is the input
signal bandwidth.

Fig. 10. CMOS output driving a resistive load.

Fig. 11. CMOS output driving an open-ended transmission line.

With a value of ±20 mV for the overdrive, and assuming a dc
accuracy of ±10 mV, the signal's amplitude at the comparator input
should be greater than 60 mV. A TTL-compatible output will generate
a swing of at least 2V. Within a 50Ω environment, this allows a
maximum divide ratio r_max and a maximum value for R_s of:

    r_max ≈ 33
    R_smax = 1600Ω                                            (12)

This means that devices having output currents
|I_oh| + |I_ol| ≥ 2V/1600Ω ≈ 1.2 mA at TTL output levels can be
tested.
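
The arithmetic behind these limits can be retraced in Python. The
sketch below reads the 60-mV figure as twice the sum of the
overdrive and the dc accuracy, which is an interpretation of the
text, and it uses the total load R_s + R_t when computing the
minimum testable output current. The function names are
hypothetical.

```python
def min_comparator_swing(overdrive_v, dc_accuracy_v):
    """Smallest usable peak-to-peak signal at the comparator input."""
    return 2.0 * (overdrive_v + dc_accuracy_v)


def max_divide_ratio(dut_swing_v, min_swing_v):
    """Largest divide ratio that still leaves a detectable signal."""
    return dut_swing_v / min_swing_v


def max_series_resistor(ratio, r_term=50.0):
    """Series resistor for a given divide ratio r = (R_s + R_t) / R_t."""
    return (ratio - 1.0) * r_term


swing_min = min_comparator_swing(0.020, 0.010)   # 60 mV
r_max = max_divide_ratio(2.0, swing_min)         # about 33
r_s_max = max_series_resistor(r_max)             # about 1600 ohms
i_min = 2.0 / (r_s_max + 50.0)                   # about 1.2 mA
print(swing_min, r_max, r_s_max, i_min)
```

The unrounded series resistance is slightly above 1600Ω; the
article's rounded values lead to the same 1.2-mA conclusion.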

DUT Power Dissipation 

At higher test frequencies, the resistive divider results in less
DUT power consumption than a capacitive load, as shown in Fig. 16.
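
The crossover between the two loading methods can be estimated with
the standard dynamic power relation P = C V^2 f for a capacitive
load. The Python sketch below uses the 50-pF capacitive load and the
10-pF plus 10-mW resistive divider load compared in Fig. 16; the
5-V output swing is an assumption made here for illustration.

```python
def capacitive_load_power(c_farad, v_swing, f_hz):
    """Dynamic power needed to charge and discharge a capacitive load."""
    return c_farad * v_swing ** 2 * f_hz


def resistive_divider_power(c_par, v_swing, f_hz, p_dc):
    """Dc load power plus the dynamic power of the parasitic capacitance."""
    return p_dc + capacitive_load_power(c_par, v_swing, f_hz)


def crossover_frequency(c_load, c_par, v_swing, p_dc):
    """Frequency above which the resistive divider dissipates less power."""
    return p_dc / ((c_load - c_par) * v_swing ** 2)


V_SWING = 5.0   # volts, assumed output swing
f_cross = crossover_frequency(50e-12, 10e-12, V_SWING, 10e-3)
print(f_cross)  # about 10 MHz

for f in (1e6, 10e6, 100e6):
    p_cap = capacitive_load_power(50e-12, V_SWING, f)
    p_div = resistive_divider_power(10e-12, V_SWING, f, 10e-3)
    print(f, p_cap, p_div)
```

Under these assumptions the resistive divider is the lighter load
above roughly 10 MHz, and at 100 MHz it dissipates about a quarter
of the power of the 50-pF capacitive load.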

I/O Pin Considerations 

The resistive divider is applicable to I/O pins, with some
additional considerations.

The tester's driver can generate a two-level signal. These levels
should be set according to the DUT's input requirements, that is,
V_l ≤ V_ilmax, V_h ≥ V_ihmin. When receiving signals from the DUT,
one of these levels has to be used for termination. This may mean
that the calculated V_t does not match the driving requirements.
V_t and R_s should be set to:

    V_t ≥ V_ihmin  if |I_oh| ≤ |I_ol|,  R_s = (V_t - V_ol) / |I_ol| - R_t
    or
    V_t ≤ V_ilmax  if |I_oh| ≥ |I_ol|,  R_s = (V_oh - V_t) / |I_oh| - R_t   (13)

The value of R_s is modified to ensure that none of the output
states will be loaded more than specified. This will occur if only
V_t is modified. Note that one level remains less loaded. If DUT
power dissipation is not critical, the value of R_s need not be
modified.

The device can be stimulated via the series resistor. CMOS normally
has negligible input current, so no level errors occur. The input
capacitance and R_s form a low-pass filter, which limits the data
rate and causes a delay at the 50% point on the transition:

    Maximum data rate = 1 / [2.3 (Z_l + R_s)(C_in + C_par)]
    Delay at 50% = 1.0 (Z_l + R_s)(C_in + C_par)              (14)

Table IV shows values for the maximum data rate obtainable and the
corresponding delay for the 50% point. Also shown is the obtainable
accuracy assuming a variation of 1 pF for the capacitance.

Table IV
Low-Pass Filter Effects on Drive Signal

Fig. 12. Resistive divider model.

Fig. 13. Transformation of a desired load to resistive divider
parameters.



CMOS Device Measurement Results

HCMOS Example

Fig. 1 shows the signal obtained at the HP 82000 tester comparator
input from an HCMOS output. Switching characteristics (at
V_cc = 4.5V, T_a = 25°C, load capacitance C_l = 50 pF) are
transition time ≈ 8 ns, propagation delay ≈ 38 ns.

For comparison, Fig. 2 shows the signal obtained with the same
output connected to an open-ended transmission line. Significant
influences are introduced by the transmission line environment.

Fig. 1. Shmoo plot of comparator input signal from an HCMOS output
buffer with 4-mA source/sink capability, loaded by a resistive
divider with parameters R_s = 200 ohms.

Fig. 2. Shmoo plot of comparator input signal from an HCMOS output
buffer with 4-mA source/sink capability, loaded by an open-ended
transmission line (t_pd = 3 ns) with the comparator in high-Z mode
(lumped capacitance = 23 pF).

CMOS 14000 Family Example

Fig. 3 shows the signal obtained at the comparator input from a
CMOS 14000 family output. Switching characteristics (at ... 25°C,
C_l = 50 pF) are transition time ... ns + 1.35 ns/pF, propagation
delay ... 80 ns ... pF. To show the comparator's sensitivity, the
waveform is not back-calculated according to equation 7 of the
accompanying article (that is, V_cmp is shown, not V_o). Such a
calculation would result in values of 3.97V for the high level and
1.14V for the low level.

For comparison, Fig. 4 shows the signal obtained with the same
output connected to an open-ended transmission line. Because of the
slow transitions, the transmission line acts as capacitive loading.

Fig. 3. CMOS MC14000 family output signal (V_oh = 2.5V at
I_oh = 2.1 mA, V_ol = 0.4V at I_ol = 0.44 mA) with resistive
divider parameters R_s = 1000Ω, V_t = ...

Fig. 4. ... (lumped capacitance = 23 pF, resulting in a load of
50 pF total).
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Typical: 
T_A = 25°C, V_dd = 6.0V

Fig. 14. Loading resulting from modifying V_t.

Fig. 15. Comparator overdrive as a function of data rate.

DC Accuracy with Resistive Divider 

For ease of use, the tester's software takes care of the
appropriate calculations of the user's comparator thresholds. This
is done using equation 7.

Fig. 16. Device under test power dissipation as a function of
frequency (1 to 1000 MHz) for capacitive loading at 50 pF and for a
resistive divider with 10-mW dc loading + 10-pF capacitance.

Thinking in terms of the noncompressed thresholds will affect the
dc accuracy. There are four sources of error in setting the desired
comparison threshold:

■ Termination source error: dV_t (mV)
■ Comparator threshold error: dV_th (mV)
■ Tolerance on R_s: dR_s (%)
■ Tolerance on R_t: dR_t (%).

The total accuracy for the desired threshold (dV_p) can be
calculated as:

    dV_p = r dV_th - (r - 1) dV_t + (1 - 1/r)(V_p - V_t)(dR_s - dR_t)   (15)

where r is the divide factor: r = (R_s + R_t)/R_t.

Using 1% resistors and assuming 10-mV basic accuracy for the
threshold and termination voltages, an accuracy dV_p ≈ 2r(10 mV)
can be obtained.
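
The error budget can be evaluated numerically. The Python sketch
below implements the signed error expression with the four error
sources listed above, plus a worst-case variant that adds the
magnitudes of all terms; the worst-case form, the function names,
and the example values (r = 5, V_p - V_t = 1V) are assumptions made
for illustration. Resistor tolerances are entered as fractions
(0.01 for 1%).

```python
def threshold_error(r, dv_th, dv_t, v_p, v_t, dr_s, dr_t):
    """Signed error of the desired threshold for given individual errors."""
    return (r * dv_th
            - (r - 1.0) * dv_t
            + (1.0 - 1.0 / r) * (v_p - v_t) * (dr_s - dr_t))


def worst_case_threshold_error(r, dv_th, dv_t, v_p, v_t, dr_s, dr_t):
    """Upper bound when all error sources add with the worst sign."""
    return (r * abs(dv_th)
            + (r - 1.0) * abs(dv_t)
            + (1.0 - 1.0 / r) * abs(v_p - v_t) * (abs(dr_s) + abs(dr_t)))


R_S, R_T = 200.0, 50.0
r = (R_S + R_T) / R_T   # divide factor of 5

worst = worst_case_threshold_error(r, 0.010, 0.010, 3.0, 2.0, 0.01, 0.01)
print(worst)             # about 0.106 V
print(2.0 * r * 0.010)   # 0.1 V, the 2r(10 mV) estimate
```

For this example the resistor tolerances contribute 16 mV of the
106-mV worst case, so the 2r(10 mV) estimate is a reasonable
approximation when the threshold is close to V_t.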

DC Measurement Capability 

When the loading on the DUT pin matches the dc specifications,
further fanout measurements are not necessary, but can be made
anyway. The presence of R_s will cause a voltage drop when a load
current I_l is forced at the output (Fig. 17). It is necessary to
consider the drop when programming the compliance voltage of the
tester's parametric measurement units (PMU). Since R_s and the
forced current are known, the actual output level can easily be
calculated with sufficient accuracy. It is:

    V_out = V_measure - V_drop = V_measure - R_s I_l          (16)

For best results, 0.1% resistors are recommended for R_s.

Fig. 17. Dc measurement path using the resistive divider.

Summary 

In the HP 82000 IC Evaluation System, the resistive divider method
offers advantages in operating speed and measurement accuracy. The
method has its restrictions and does not ensure testability of
every DUT.
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Memory board, 16M-byte ...... 25/June 

Merge program .„,,.....,..,. , 77/Oct. 

Merge system, Starbase/Xll .,.,..„.., 6/Dec 

Messages and methods 19,89/Aug. 

Microwave extender output section 

, 49/Oct. 

Microwave signal generators 14/Qcl, 

Millimeter-wave analysis 8/Feb. 

Mixer/detector ............... 8/Feb, 

Model, electromigration .„..„.« 79/June 

Models, FET 56/Oct. 

Models, termination , 59/Dec. 

Modular instrument systems „...„ 91/Apr. 

Modular signal generators 14/Oct 

Modulation transfer function, 

lightwave 36,41/June 

Modulator, pulse 54/Ocl 



MOMA (multiple, obscurable. 

movable, and accelerated 

windows „ ..„,.,..,.. 11, 25/Dec, 

MPL source access system 50/Dec, 

M.S-DOS objects 28/Aug. 

Multifunction synthesizer 52/Feb, 

Muhimeter, 8^-digit fi/Apr. 

Multislope rundown ... 9/Apr. 

Multislope runup 10/Apr, 



N 

Network, voice and data 42/Feb. 

Neural data structures , B9/June 

Neuron programming .... .... 69/June 

NewWave agent 32/ Aug. 

NewWave application program 

interface (API) 32/Aug. 

NewWave computer-based training 
(CBTJ 48/Aug, 

NewWave encapsulation 5 7/ Aug. 

NewWave environment. 

overview ♦ .,,..... 6/Aug. 

NewWave help facility 43/Aug. 

NewWave object management 
facility (OMF) .... ,, 17/Aug. 

NewWave Office ......... ,...,... 23/Aug. 

NewWave windows 23/Aug. 

N-flops 70/June 

NMOS-IIJ chip ,..,.,.* 62/Feb. 

Noise, ADC ,„,., 13/Apr. 

Noise floor, optical measure- 
ments; 49/june 

Noise, signal generator 27/Oct. 

Numeric data parser « 70/Feb. 

Nusseit analog 82/Dec. 

o 

Object-based user interface 9/Aug, 

Object class 18,91/Aug. 

Object encapsulation ,,. 89/ Aug. 

Object life cycle „ 19/Aug. 

Object links 10, 18/ Aug. 

Object management Facility 17/Aug. 

Object model 11/Aug. 

Object models and views 94/Aug. 

Object module, HF^UX „. ., 78/Oct 

Object-oriented 69,8B/Apr. 

Object-oriented language 93/Aug. 

Object-oriented life cycle 98/Aug. 

Object-oriented technology ......... 87/Aug. 

Object properties 18/Aug. 

Object-relationship diagrams ...... 87/Apr.

Objective-C 95/Aug.

Objects 7u\RG/Apr. 

Objects, graphic ........... 13/Dec.

Objects, NewWave 9,17/Aug.

Office metaphor ..... 12/Aug.

Office, NewWave 23/Aug.

Offscreen memory 11,15/Dec.

Offset errors ....... 23/Apr.

Otets calibration , 25/Apr. 

On-the-fly counter readings 33/Feb. 

Optical device measurements 42/June 

Optical frequency-domain
reflectometry 43/June
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Optical reflection measurements 

42/June

Optical time-domain 

reflectometry * i June 

Optical-to-electrical device

measurements 36/June

Optoelectronic erase bar ...„.„„..,.. 1 < 
Oscillator, fast hopping 34/Oct.

Oscillator, YIG-tuned 39/Oct.

Outguard section ... 31/Apr.

Output system, signal generator .. 42

Overlay planes 11,33/Dec.

Oxide passivation 7n/Qct. 



"Packageless" microcircuits .. 44 Oct 

Packets 43/Feb.

Parser, command 22/Oct.

Partially reflective light guides .... QK Ot t. 

Passivation, pbotodetectors &9 ' 

PC/CD-ROM source access system . 50/Dec.

P code 39/Aug.

Peak detector

Performance signal generators 14/Oct.

Phase digitizing ... 28/Feb.

Phase-locked binary refen mi. - 

frequency , 6B'Feb- 

Phase-locked loop 27.-4 " 

Phase noise ....... 27,39/Oct.

Phase progression plot 3u7Feb. 

Phase modulation 59/Feb. 

Photodetectors, pin, high-speed .. 56/June

Photodetector processing 69/Oct.

Photodiode measurements 42/June

Pin photodetectors 56/June

Pipeline, graphics 74/Dec.

Pixel cache 7^/Dec 

Pixel processor 77/Dec. 

Pixel value „, l%IBec< 

Pixmap 11/Dec.

Platform definition ... HhOtt. 

Pointers, updating 7!i n. I 

Polymorphism 9n/Aug. 

Port/HP-UX fPORT/RX) ,. fiS/Oct- 

Power compression measurements, laser 41/June

Precision Architecture computer, midrange ............. 18/June

Precision Architecture, HP-UX shared libraries 86/Oct.

Premastering 54/Dec.

Processor board, midrange computer 19/June

Program faults 51/Apr.

Programming with neurons .... 69,72/June

Progressive refinement 86/Dec.

Pulse modulation system 51/Oct.

Pulse modulator IC 56/Oct.



Quarter-inch cartridge tape drive . 67/Aug.
Query/debug 17/June



Radios! ty *... 
Ray tracing . 



7y/Dec. 
7BT)e(.. 



Reading stcirage. multimeter 17 

Receivers, lightwave ... 52/June

Real-time data base 6/June

Real-time firmware 79/Aug.

Recognising code quality 65 

Reference frequency 

Reference voltage -H Apr 

Reflection in light guides .. 98/Oct.

Reflection measurements, 

opli: 4^ 

sensitivity measure- 

ments, laser 41 [m*e 

Reflectivity, dielectric 98/Oct.

Refrat 91 

Reliability, tape drive ... 74/Aug.

Reliability, IC 79/June

Reliability, software 75/Apr.

Rendering ... 11/Dec.

Resistive divider, IC testing ' 

Resolution, ADC H..*»/Apr 

Responsivity, electrooptical device ..... 40/June

Result objects ... 99/Aug.

Return loss measurements, optical 44/June

Reusability 63/Apr.

Reverse power protection 50/Oct.

RF signal generator .......... 14/Oct.

RFK signal generator 59/0ct. 

Routing, network ... 47/Feb. 



Sampling ... 9/Feb.

Sampling) equivalent time IB.Apr. 

SA/SD and design process .... 54/Apr.

Scan conversion ... 75/Dec.

Scan paths .„,♦ , G4/Feb, 

Semaphores 16/June,17/Dec.

Sequencer IC ......... 38/Feb.

Shared libraries, HP-UX ... 86/Oct.

Shared memory 86/Oct.,11,12/Dec.

Sharing cursors 27/Dec.

Sharing fonts ... 27/Dec.

Sharing objects ........ lfi/Aug.,1 4 -h. ■■ 

Sharing tin? color map ...,..,. 2H/IV*.. 

Signal generators 14/Oct. 

Signal handling, shared libraries . 58/Oct 

Signature analysis ..... 62/June

Simulation, electromigration 79/June

Single-loop frequency synthesis 

16 ( 39/Oi:l. 

Slope responsivity 40/June

Slot Module 93,96/Apr.

Snapshots 21/Aug.

Software defect analysis 50/Apr.

Software defect causes ..... 59/Apr.

Software defect data collection ... 57/Apr.

Software defect perspective 57/Apr.

Software defect prevention 64/Apr.

Software defect data validation .. 58/Apr.

Software defect types .... 62/Apr.

Software failure rate ............. 75/Apr.

Software process improvement ... 65/Apr.

Software productivity 81/Apr.

Software release goals ... 77/Apr.

Software reliability 75/Apr.

Software test tool SSflimS 



Solder joint inspection 81/Oct.

Source code access system .......... 50/Dec.

Source code, lack of ... 76/Oct.

Sources, lightwave 52/June

Spectra, lead vibration 83/Oct.

SPUs, HP Precision Architecture . 18/June
SRX graphics subsystem 74/Dec.

.... 34 Dei: 
Starbase 

State net , 83 

State transition diagram ... 80/Aug.

Storyboard techniques 95/Oct.

Strip file ..„.„. 77 

Strip program ., ....*...,„. 7" 

Structured testing 83/Aug.

Structured analysis and structured design ... 54,80/Apr.

Structured methods ... 79/Aug.

Subsampling, synchronous ......... 1> 

Substructuring #4 LK ■ 

Super-blocking ... 32/June

Surface mount leads, unsoldered . 81/Oct.
Surface mount process, double-sided 23/June

Switching engine 44/Feb.

Symbolic debug, driver ... 76/Oct.

Synthesized signal generators 14/Oct.

System analysis 86/Apr.



Tape cartridge mechanics 69/Aug. 

Tape drive, ¼-inch 67/Aug.

Tape drive, data compression ..... Zti/June 

Tape head wear ... 74/Aug.

Task automation 34/Aug. 

Task language, agent 35..'JB .Aug. 

Task language parser 40/Aug. 

Tearmuild engine 44/Feb. 

Temperature distribution, thin-film 81/June

Termination models, IC test ... 59/Dec.

Test plan ..... 72/Apr.

Test process ... 71/Apr.

Test script 5Mime 

Testing, Starbase/X11 Merge 42/Dec.

Thermal control, laser ................ 52/June

Throughput, multimeter 31/Apr.

Time interval analyzer 6/Feb.

Time to failure, thin metal 

lines . ■ Hivjum- 

Time variation display I 

Tokens 21/Oct.

Track density 70/Aug.

Track-and-hold circuit 19/Apr.

Track seeking 72/Aug.

Transform engine 75/Dec.

Transform, time-domain 38/June

Transmission line effects, IC testing 58/Dec.

Transparency 75,77/Dec.

Traveling salesman problem 7T>'Juim 

Trigger circuit 24/Feb.

Transparent color 37/Dec.

Troubleshooting, HP 3000 50/Dec.

Tuned dipole antenna ......... 62/Oct.
Tuples ...... 7/June
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Turbo SRX graphics 

subsystem , 12,74/Dec. 

U

UART, custom 36/Apr.

Unit testing ... 69/Apr.

Unsoldered leads, surface mount . 81/Oct.
User-centered application definition ... 90/Oct.

V

Vectored interrupts ... 73/Feb.

VCO, fast hopping ....... 34/Oct.

VCO, YIG-tuned 39/Oct.

Vibration spectrum, SMT leads ... 83/Oct.

Vibrometry, laser 82/Oct.

Video feedthrough 58/Oct.

Virlunscape 58/june 

Video signature analyzer .. 62/June 



Views ... ZO/Aug. 

Virtual circuits .... 4;i h h 

Virtual instruments 98/Aug.

VISTA 93/Aug.

Visual type 12,35/Dec.

VLSI, graphics 74/Dec.

Voice and data network ..... 42/Feb.

Void formation, electromigration 81/June

Voltage reference, high-stability . 28/Apr.

VMEbus ............ 91/Apr.

Vscope ........ .....,....,.,.,,.,.,.. Ti'J lnnr 

VXIbus .. 91/Apr. 

VXIbus development tools 9fe/&pr< 

W 

Waveform analysis library 47/Apr.

Wave impedance . 64/Oct.

Windows, NewWave ... 23/Aug.

WYSIWYG 10/Aug.

X

X driver interface (XDI) 9,12/Dec.

XI 1 R Due. 

X server ... 6,12/Dec.

X Window System H/Der, 



YIG-tuned oscillator .... 39/Oct.

Z

Z-buffer 75,76/Dec.

Z-cache . 76/Dec.

Zero-dead-lime counters .. ifUJ3/Feb. 



PART 3: Product Index 



HP E1400A VXIbus Mainframe ... Apr.

HP E1404A VXIbus Slot Module ... Apr.

HP E1490A VXIbus Breadboard Module ... Apr.

HP E1495A VXIbus Development Software ... Apr.

HP 3000 Series 935 Computer ... June

HP 3458A Multimeter ... Apr.

HP 5364A Microwave Mixer/Detector ... Feb.

HP 5371A Frequency and Time Interval Analyzer ... Feb.

HP 7980XC Tape Drive ... June

HP 8644A Synthesized Signal Generator ... Oct.

HP 8645A Agile Signal Generator ... Oct.

HP 8665A Synthesized Signal Generator ... Oct.

HP 8702A Lightwave Component Analyzer ... June

HP 8904A Multifunction Synthesizer ... Feb.

HP 9000 Model 835 Computer ... June

HP 9000 Series 300/800 TurboSRX 3D Graphics Subsystem . Dec.
HP 9145A ¼-Inch Cartridge Tape Drive ... Aug.

HP 11889A RF Interface Kit ... June

HP 11890A Lightwave Coupler ... June

HP 11891A Lightwave Coupler ... June

HP 82000 IC Evaluation System ... Dec.

HP 83400A Lightwave Source ... June

HP 83401A Lightwave Source ... June

HP 83402A Lightwave Source ... June

HP 83403A Lightwave Source ... June

HP 83410B Lightwave Receiver ... June

HP 83411A Lightwave Receiver ... June

HP 98646A VMEbus Interface ... Apr.

HP NewWave Environment ... Aug.

HP Real-Time Data Base ... June

HP Starbase Graphics Library ... Dec.

HP VISTA ... Aug.

X Window System Version 11 .................. Dec.



PART 4: Author Index 



Aklyama, Tadashi Apr. 

Afbin, Robert D- June 

Andersen, Brad E, ..... Oct. 

Andreas. fames R Dec. 

Barnes, James O Feb. 

Bartlett, Paul F. Aug 

Berlin, Lucy M ...,...,.„,..,_.. Oi 1. 

Bender- Dale R. „ , Feb. 

BianchL Mark J. .. June 



Bt.Riwski. Donald T Feb., Oct. 

Boyton, Jeff R Dec. 

Brockmann, Russell C June 

Bronstein, Kenneth H. Dec. 

Brown. John M ... Dec. 

Burgoon, David A. Dec. 

Cathell. B, David Dec. 

Ceely, Gary A, Apr. 



Chakrabarti, Sankar L. Dec. 

Chu. David C Feb- 

Uim\ Robert C. .„„♦,.,....„.. Dec. 

Coackley. Robert Fph, 

Conrad. Geraldine A. June 

Grow, William M Aug. 

Curtis. G. Stephen Oct, 

Czenkusch, David A. . Apr. 
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\itthony J- ... 
Dysart, John A. .... 

:rian B. ...... 

Elliot, Ian A. ........ 



Aug, 

June 
Apr. 
Apr. 

Oct 
Oct. 



Fa I eh i. Fey 7 

;: 
ter. William A., I; 
*r Cathy 
Fri«?d. Steve R. 

': L 
Fuller Ian J, . Aug, 

Giem. John ... . v ; « r 

Dec. 

Gills, David Aug. 

Giveits. Cynthia . . June 

Goeke. Wayne C Apr. 

Grady, ftobeil B . Apr. 

Hams, Tracey A ....... Aug. 

Hanson, Scott A Aug. 

Jeffrey G June 

Hart, Michael G. . June 

Heikes, Craig A. . Feb. 

Heinzl, Johann J Feb. 

Helmsa, Bennie E ..,...., Oct. 

Herleikson, Farl C 0i I 

Hemday, Paul ... [iuie 

Hiebeit, Steven P .. D&c. 

Higgins, Thomas M., Jr. Feb. 

Hn. Donna ..... - Apr. 

Hong. !.►■ 1 June 

Hoover, David M .. Oct, 

Ives. Fred H ....... ... ..... Feb, 

Jencek, John J Aug. 

Jeflsen, Kenne-th \[i 

Jones, Carnhn F I h\ 

ImsI |.nri'' L - W \|'i 

kiilstnn. Michael B Dan. 

Kanago, Kerwin D Oct. 

Kato, Jeffery J. ........ June 

Key I v. Catherine A. . ,...., Oct 



Keller, John ... 
Kraemer, Thomas F. 
Krugen Gregory A. - 
Kurtz. Barry D 

I -am , Beatr 
Lang. John J. ... 
Leath, Charles i 
Leyde. Kent W 



June 
Aug, 
Apr, 
Apr. 

Aug. 
, Dec 
. Oct. 

June 



Light. Michael K June 

Liu. Ching-Chao June 

Loomis. Courtney Dec, 

. June 

-nner , Lawrence A* .. Aug. 

Marchington. Keith A. ...... Dec 

Marcoux. Paul J .. June 

Martelli, Anastas 

McCabe, Thomas J ...„.„ . Apr, 

MtCurmick Alan J Feb. 

Mc|unkin. Barton L. Oct. 

MtNamee, Michael D. ,. Oct. 

Men bant, Paul P June 

Mayer, Thomas O. June 

Moore, Floyd E June 

Nakajn, Takeshi Apr. 

Xanulitskv. Vladimir Iutih 

NimorL Torrance K. . 

Owen. Jens R Dec. 

Packard, Barbara B ♦ fttlg 

Pearce. Stephen J Dec. 

Platl. David L. Oct, 

Plitschka, Rainer Dec. 

ft, Rollin R .,, June 

Render, WulF D June 

Robinson. Paul F . Aug, 

Robinson, Peter R - Dec, 

Rustic i. David I Apr, 

Sachs, George M ... Dec. 

Sasabuchi, Katsuhiko Apr, 

Schneider, Richard ....„.,.„„„ Feb, 



:rtz< David J. ... Feb, 

Shackleiord, J. Barry June 

Shaughnessy. Kenneth W 

Showman. Peter S Aug. 

Simms. Mark [ Aue 

Sims, John M Oct. 

Sloan June, Oct. 

Smith, David E Apr 

Snook. Do .- 
Spilman. Vicky 
Stambaugh, Lisa B 

Stambaugh, Mark A Oct. 

Steadman, 

Stearns.. Glenn R. 

Stephenson. Paul S- ......... Feb. 

Steven Scott D Apr. 

Stroyan, Michael H Dec. 

Summers, James R 

Sweelser, David J. , Dec. 

Sweetser, Victoria K Apr, 

Swerlein, Ronald U Apr. 

Talbot. Mark D Feb 

Tanner, Eve M. .. pel, 

Thayer, Larry J. Dec.

Thompson, Kenneth S Feb. 

Topham, Andrew D ................. Aug. 

Tuttle. Myron R , June 

Van Maren, David J June 

Venzke, Stephen B , Apr, 

Vogen, Andy June 

luhii A Dec. 

Wall, Teresa A ., Apr, 

Ward, William T Apr. 

Walking, Brian D Oct, 

Watson, R. Thomas Aug. 

UV< hhler. Mark F«&. 

Wheian, Charles H Aug. 

Whin, Eugene J Aug, 

Wong, Roger W. June 

Wright. Larry R Feb.. ( N I 

Wright, Michael J ,..,„ ..,...,. June 

Voder, William R. Dec 
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Custom VLSI in the 3D Graphics Pipeline 

VLSI transform engine, z-cache, and pixel processor chips
widen bottlenecks in the pipeline to allow the HP 9000 Series 
300 and 800 TurboSRX graphics subsystem to deliver 
enhanced performance compared to the earlier SRX 
design. 

by Larry J. Thayer 



PRODUCTS FOR DISPLAYING 3D GRAPHICS on engineering
workstations have been appearing at an ever-increasing
rate over the last few years. Products of each succeeding
generation are much more interactive and have significantly
more capabilities than earlier ones. Fueling the fast-paced
change are new algorithms, better architectures, and most
important, advances in VLSI (very large-scale integration)
processing and design.

Within HP, perhaps the first use of a custom VLSI chip
for computer graphics applications was in a graphics
display for a desktop computer introduced in 1981. The chip
accelerated vector drawing on HP 9845B Computer displays.
Our first 3D product, the HP 98700A, introduced in
1985, drew fast wireframe images with the aid of special
commercially available video RAM chips. These chips
allowed the raster display to be refreshed at the same time
the image was changing.

HP's first solids modeling graphics subsystem, the HP
9000 Series 300 and 800 SRX, was introduced in 1986. It
uses a proprietary HP process (NMOS-III) to build chips
for floating-point operations (essential for fast 3D graphics)
and for the scan conversion process (polygon and vector
drawing).¹ Another proprietary process (LTCMOS) is used
for a chip that caches pixels, thus allowing multiple pixels
to be changed per RAM cycle.² The upgrade system for the
SRX, the TurboSRX, introduced in 1988, uses even more
VLSI for increased performance and functionality.

Custom VLSI is the technology of choice for producing
interactive 3D graphics for several reasons:
■ VLSI devices are a capable source of the very high
computation rates needed for fast, interactive graphics. (The
scan converter chip used in both the SRX and the TurboSRX
is capable of performing over 300 million additions
per second.)

■ Data flow is pipelined, with each point in the pipeline
having a particular function. VLSI chips can be tailored
to each function.

■ The low-cost potential provided by large-scale integration
makes interactive 3D graphics capability available
in a workstation that an engineer can afford.

This article describes how the 3D graphics pipeline of
the SRX was analyzed, and how custom VLSI was used in
the next-generation product, the TurboSRX, to improve the
overall graphics performance.

Pipeline Stages 

Graphics workstations contain a data pipeline for displaying
user graphics data bases (see Fig. 1). The source
data is stored in the host system memory, typically in a
display list format. This list is simply a file containing a
hierarchical list of the graphics primitives needed to draw
the image. First in the pipeline is the system CPU, which
reads the display list and sends commands to the graphics
subsystem. Using the main system CPU for display list
processing minimizes system cost and allows the size of
the display list to be limited only by the virtual memory
space of the processor.

Next in the graphics pipeline is the transform engine
block, which resides in the graphics subsystem and consists
of one or more microcodable processors (called transform
engines). The transform engine block performs matrix multiply
calculations for positioning the image in three-dimensional
space, clips the image to the viewing window, calculates
polygon vertices for parametric surface commands,
and applies lighting calculations for realism.

Fig. 1. The 3D graphics pipeline in the HP 9000 Series 300 and 800
TurboSRX graphics subsystem, spanning the host SPU and the
graphics subsystem.

When the transform engines have finished all necessary
calculations, they send the polygon and vector endpoints
(in integer device coordinates) to the scan converter. The
function of the scan converter is to draw the individual
polygons and vectors into the frame buffer where they can
be viewed. In the scan conversion process, each pixel in
the polygon is calculated individually to determine its x,
y, z, red, green, and blue values. The x and y values determine
the pixel's location on the screen, the color values
allow smooth shading of colors, and the z values are sent
to the z-buffer for hidden-surface removal.

After the pixels have been calculated, a dither circuit
operates on the color values to provide a greater number
of apparent colors, thus allowing true-color images with
as few as eight graphics planes. (When 24 planes of frame
buffer memory are available, dithering is not used.)
Transparency is implemented by drawing alternate pixels of the
transparent surface, a technique known as "screen door
transparency." The technique gets its name from the
screen-door-like pattern used to determine which pixels to draw.
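
The screen-door idea can be sketched in a few lines of Python. This is only an illustration: the checkerboard pattern, the function names, and the list-of-lists frame buffer are assumptions made for the example, not a description of the SRX hardware pattern.

```python
def screen_door_mask(x, y):
    """Return True if pixel (x, y) of a transparent surface is drawn.

    Assumed pattern: a simple checkerboard, so alternate pixels are skipped.
    """
    return (x + y) % 2 == 0


def draw_transparent(frame, color, x0, y0, width, height):
    """Draw a transparent rectangle by writing only the masked pixels."""
    for y in range(y0, y0 + height):
        for x in range(x0, x0 + width):
            if screen_door_mask(x, y):
                frame[y][x] = color  # pixels that fail the mask keep the background
    return frame


# Hypothetical 4x4 frame buffer filled with a background value.
frame = [["bg"] * 4 for _ in range(4)]
draw_transparent(frame, "fg", 0, 0, 4, 4)
for row in frame:
    print(" ".join(row))
```

Half of the covered pixels keep whatever was drawn behind the surface, which is what makes the surface appear transparent.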

Z-buffering is a general-purpose approach to hidden-surface
removal. The z-buffer is simply RAM in which 16 bits
are allocated for each pixel on the screen. It works by
comparing the z value (depth) of the pixel being drawn to that
of the pixel already present at that location, if any. If the
new pixel is closer, it is drawn to the frame buffer and the
z value is updated to that of the pixel being drawn. If it is
farther away, the pixel is not drawn and the z-buffer is not
updated.
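
The depth test described above can be written as a minimal Python sketch. It assumes that a smaller z value means "closer" and that the z-buffer is initialized to the largest 16-bit value; both conventions, and all names here, are illustrative assumptions rather than details taken from the article.

```python
FAR = 0xFFFF  # assumed initial depth: the largest value that fits in 16 bits


def make_buffers(width, height, background=0):
    """Create a frame buffer and a matching z-buffer."""
    frame = [[background] * width for _ in range(height)]
    zbuf = [[FAR] * width for _ in range(height)]
    return frame, zbuf


def plot(frame, zbuf, x, y, z, color):
    """Draw the pixel only if it is closer than what is already stored."""
    if z < zbuf[y][x]:        # assumption: smaller z is nearer the viewer
        frame[y][x] = color   # new pixel wins: write the color ...
        zbuf[y][x] = z        # ... and update the stored depth
        return True
    return False              # farther away: neither buffer changes


frame, zbuf = make_buffers(2, 2)
drew_near = plot(frame, zbuf, 0, 0, 100, "red")   # nothing there yet, drawn
drew_far = plot(frame, zbuf, 0, 0, 500, "blue")   # behind the red pixel, rejected
```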
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Comparative Performance 

Because the SRX was the first product of its caliber, there 
were many unknowns about how the product would be 
used and how it would perform. Since then, much has 
been learned from our customers and from our own 
analyses about what features are commonly used and what 
sizes of polygons are typically drawn. For the purpose of 
illustration, we will examine two kinds of polygons: small
polygons (defined as being 20 × 20-pixel unconnected
quadrilaterals) and large polygons (defined as being larger
than 200 × 200 pixels). The performance metric for small
polygons is polygons per second, and large polygons are 
measured in pixels drawn per second. 

Figs. 2 and 3 show the relative performance of different
stages in the pipeline. It is important to keep in mind that
since the graphics architecture is organized as a pipeline,
the performance of the system is determined by the slowest
block in the sequence. Note that for small polygons the
transform engine block limits the performance on the SRX,
with the z-buffer being the next limiter. For large polygons,
the z-buffer is the primary culprit, but the dither/transparency
circuit is right behind.

It was clear from examining the data that to improve
performance significantly for both cases, it would be
necessary to change more than one functional block.
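
The reasoning can be illustrated with a short Python sketch. The stage rates below are hypothetical relative numbers chosen only for the example (they are not read from Figs. 2 and 3); the point is that speeding up the slowest stage alone simply exposes the next limiter.

```python
def limiting_stage(stage_rates):
    """Return (name, rate) of the slowest stage, which sets pipeline throughput."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


# Hypothetical relative rates for small polygons (illustrative only).
rates = {
    "transform engine": 1.0,
    "scan converter": 4.0,
    "z-buffer": 1.5,
    "dither/transparency": 3.0,
}

before = limiting_stage(rates)

# Tripling only the transform engine rate moves the bottleneck to the z-buffer.
after = limiting_stage({**rates, "transform engine": 3.0})

print(before, after)
```

In this made-up example a threefold faster transform engine raises overall throughput by only 50%, because the z-buffer becomes the limiter.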

Transform Engine 

Each transform engine consists of a microcodable processor
and floating-point chips. (In both the SRX and the TurboSRX,
NMOS-III floating-point chips are used.) Because
of the many intricate, sophisticated algorithms necessary,
it was decided that for the TurboSRX this function should
be implemented in the same general-purpose fashion as in
the SRX. The approach taken was to use multiple higher-speed
transform engines to gain performance. Product
packaging limitations prevented a faster discrete implementation
using bit-slice hardware, so an NMOS-III VLSI
chip was designed to enable three improved transform engines
to fit into the product. It was dubbed TREIS, which
stands for TRansform Engine In Silicon. Integration provides
both reduced size and increased performance.

Fig. 2. Relative performance of 3D graphics pipeline stages
for small polygons.

Fig. 3. Relative performance of 3D graphics pipeline stages
for large polygons.

DECEMBER 1989 HEWLETT-PACKARD JOURNAL 75

© Copr. 1949-1998 Hewlett-Packard Co.

Each transform engine contains the full set of microcode,
so any transform engine can execute any graphics operation.
One transform engine acts as the master, distributing
graphics commands among the three transform engines.
Any command can therefore be distributed to the next free
transform engine, including the master.
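
A rough Python sketch of "send each command to the next free engine" follows. The scheduling policy, the per-command costs, and the function name are assumptions made for illustration; the article does not describe how the master actually arbitrates.

```python
import heapq


def distribute(command_costs, n_engines=3):
    """Assign each command to whichever engine becomes free first.

    Engine 0 stands in for the master, which also takes commands.
    Returns the engine chosen for each command and the overall finish time.
    """
    free_at = [(0, engine) for engine in range(n_engines)]  # (time free, engine id)
    heapq.heapify(free_at)
    assignment = []
    for cost in command_costs:
        time_free, engine = heapq.heappop(free_at)   # earliest available engine
        assignment.append(engine)
        heapq.heappush(free_at, (time_free + cost, engine))
    finish = max(time_free for time_free, _ in free_at)
    return assignment, finish


# Six hypothetical commands of equal cost spread over three engines.
assignment, finish = distribute([1, 1, 1, 1, 1, 1])
print(assignment, finish)
```

With equal-cost commands the three engines finish in one third of the time a single engine would need, which is the ideal case behind the "threefold gain" in raw hardware performance.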

The result is more than a threefold gain in the raw
hardware performance in the transform engine stage of the
pipeline for small polygons (see Fig. 2). By adding improved
microcode and software and some higher-level
functions, performance levels up to ten times that of the
SRX can be achieved. One higher-level function, quadrilateral
mesh, allows the vertices of adjacent quadrilaterals
to be transformed, clipped, and lighted a single time,
resulting in a net reduction of processing by almost a factor of
four.
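
The "almost a factor of four" figure follows from counting vertices. The sketch below assumes a regular rows × cols grid of quadrilaterals, which is an illustrative assumption: unconnected quadrilaterals each carry four vertices, while a mesh shares vertices between neighbors.

```python
def vertices_unconnected(rows, cols):
    """Each quadrilateral is sent with its own four vertices."""
    return 4 * rows * cols


def vertices_mesh(rows, cols):
    """A rows x cols mesh shares vertices, leaving a (rows+1) x (cols+1) grid."""
    return (rows + 1) * (cols + 1)


def reduction(rows, cols):
    """Ratio of vertex work without a mesh to vertex work with one."""
    return vertices_unconnected(rows, cols) / vertices_mesh(rows, cols)


print(reduction(10, 10))    # a small mesh saves somewhat less than 4x
print(reduction(100, 100))  # the ratio approaches 4 as the mesh grows
```

The ratio never quite reaches four because the vertices along the mesh border are shared by fewer than four quadrilaterals.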

TREIS (see Fig. 4) is a custom NMOS-III chip containing
about 170,000 transistors, including 1536 bytes of pointer
RAM and an ALU, in a 272-pin pin-grid array (PGA) package.
It outputs a 16-bit microcode address and reads a 68-bit-wide
microcode word with a highly pipelined architecture.
It improves performance over the SRX transform engine
by combining some two-state activities into one state. Like
the SRX transform engine, it connects to HP-proprietary
floating-point math chips through a 32-bit floating-point
bus for accelerated transformation, clipping, lighting, and
parametric surface calculations. The connection to the
polygon-rendering chip is through a double-buffered RAM
containing polygon and vector vertex addresses, z values,
and color data.



Z-Buffer 

Once the transform engine bottleneck was improved, the
next performance limitation for small polygons was the
speed of the z-buffer. The SRX's z-buffer is in the
nondisplayed part of the frame buffer. (The frame buffer holds
2048 × 1024 pixels, but only 1280 × 1024 can be displayed
at a time. Most of what is not displayed can be used as a
z-buffer.) While this approach minimizes the cost of low-end
systems, maximum performance cannot be obtained
when frame buffer and z-buffer accesses cannot be done at
the same time.

When drawing with the z-buffer enabled, the SRX must
read the z value from the frame buffer, compare the z value
of each pixel with the z value present at that location, write
the new z value back into the frame buffer if necessary,
and write the pixels into the frame buffer if necessary.
Using pixel caching allows each access to handle up to
eight pixels (the size of a frame buffer "tile") simultaneously.

Most of the z-buffer overhead was eliminated by providing
an optional dedicated z-buffer, which allows z-buffer
RAM cycles and frame buffer RAM cycles to occur in parallel.
In this dedicated z-buffer is another custom chip, the
z-cache, which allows multiple z values to be fetched and 
stored in a single RAM cycle, increases the tile size, and 
performs comparisons of z values at a rate twice as fast as 
the SRX. 

The z-cache is an LTCMOS standard-cell design containing
about 370K3 gates, packaged in a 68-pin plastic leaded
chip carrier. It is similar in design and size to the pixel
cache.² It performs fast z comparisons and allows multiple
z-buffer operations to take place in a single RAM cycle.
One chip per plane is used in the z-buffer.




Fig. 4. TREIS (TRansform Engine In Silicon) chip.

Fig. 5. Pixel processor chip.
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Pixel Processor 

The z-cache chip is still not enough to prevent the z-buffer
from limiting overall performance, particularly for large
polygons.

In the SRX, whenever a new pixel needs to be written
into a tile other than the one accessed by the previous pixel,
the polygon-rendering chip is held from drawing any more
pixels until the new z tile is read. A third custom chip,
the pixel processor, was added between the polygon-rendering
chip and the z-buffer. It removes that latency by
issuing an early warning when a new tile will be needed.
This signal is provided far enough in advance of the pixel
that the z values can be fetched from the z-buffer before
the pixels are drawn. To match the output of the
polygon-rendering chip with the z-buffer better, a FIFO buffer was
added at the output of the pixel processor. This way, both
the polygon-rendering chip and the z-buffer can operate
more efficiently.

The pixel processor (see Fig. 5) is a custom NMOS-III
chip containing about 110,000 transistors in a 168-pin PGA
package. As mentioned earlier, it contains performance
improvement features such as the fast dither and transparency
operations, the FIFO control, and the early z read signal
to prevent slowing down the polygon-rendering chip. In
addition, it contains three 1024-byte gamma-correction
ROM tables for more accurate color representation, and
window clipping operations for up to 32 movable, obscurable,
overlapping, accelerated graphics windows. A
pipeline valve inside the chip allows fast window operations
without emptying the graphics pipeline. All pixel
operations inside the pixel processor are performed at the
polygon-rendering chip's pixel output speed, so the
graphics throughput does not slow down when using any
of its features.

Notice in Figs. 2 and 3 that these z-buffer enhancements
improve that portion of the pipeline for small polygon
performance by about 50% and for large polygons by a factor
of three.

Dithering and Transparency

With z-buffer operations streamlined, there was one more
stage in the pipeline left to be improved. Dithering and
transparency in the SRX are performed with discrete TTL
logic. While this does not show up as a performance limiter
in the SRX because it is faster than the z-buffer (see Figs.
2 and 3), it would have become the limiting factor in the
TurboSRX with the fast z-buffer. Instead of leaving the
dither and transparency circuits in TTL, it was decided to
include those functions in the pixel processor. This both
improves the dither/transparency performance by a factor
of two for large polygons (Fig. 3) and improves the reliability
and cost of the overall system.

Conclusions 

Figs. 2 and 3 reveal that no stage of the TurboSRX
pipeline is significantly slower than the others for either
small or large polygons. Since the pipeline is fairly well
balanced, it might appear that higher performance would
require that all parts of the pipeline be replaced, requiring
a large amount of product development time and cost. However,
as VLSI technology improves, so does the potential
improvement of 3D graphics subsystems. Several areas of
VLSI technology have been improving lately, including
speed, density, packaging, and design productivity.
Furthermore, the experience gained on earlier products has
pointed the way toward new and better algorithms and
architectures. Future graphics products will clearly have
to take advantage of these latest advances to meet growing
customer expectations.
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Global Illumination Modeling Using 
Radiosity 

Radiosity is a complementary method to ray tracing for
global illumination modeling. HP 9000 TurboSRX graphics
workstations now offer three illumination models: radiosity,
ray tracing, and a local illumination model.

by David A. Burgoon 



IN COMPUTER GRAPHICS image generation systems,
an illumination model can be invoked locally or globally.
When invoked locally, only incident light from
light sources and object orientation are considered in
determining the intensity of light reflected to the observer's eye.
Invoked globally, the light that reaches an object by reflection
from or transmission through other objects in the scene
environment is also considered.

Local illumination models are popular because they produce
reasonably realistic rendering and can be computed
at interactive rates using hardware acceleration techniques.
Global models are usually used when rendering realism is 
of primary importance. Traditional global illumination 
modeling methods are extremely computationally inten- 
sive. As a result, interactivity is usually sacrificed for the 
sake of realism. 



One of the most familiar local illumination models is
that of Phong.¹ Turner Whitted² enhanced the Phong model
for global use in ray tracing by accounting for the light
reflected or transmitted from other objects in the environment.

In the ray tracing procedure, an intersection tree is
constructed by tracing a ray from the observer's eye through
each pixel into the environment. At each surface intersected
by the ray, two branches are added to the tree, representing
the spawned reflected and transmitted rays. Each surface
intersection is represented by a node in the tree. This process
is repeated recursively. The final pixel intensity is
determined by traversing the tree starting with the leaves
and working toward the root, computing the intensity
contribution of each node using the illumination model. The
final pixel intensity is the sum of all of these contributions.
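
The tree traversal can be sketched as a short recursive function in Python. The node fields, the reflection and transmission weights, and the hand-built tree are all assumptions made for illustration; a real ray tracer would build the tree by intersecting rays with the scene and would evaluate the illumination model at each node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One surface intersection in the tree (all fields are illustrative)."""
    local: float                          # intensity from the local illumination model
    k_reflect: float = 0.0                # assumed weight of the reflected branch
    k_transmit: float = 0.0               # assumed weight of the transmitted branch
    reflected: Optional["Node"] = None
    transmitted: Optional["Node"] = None


def intensity(node):
    """Sum contributions from the leaves back toward the root."""
    if node is None:
        return 0.0                        # the ray left the scene: no contribution
    return (node.local
            + node.k_reflect * intensity(node.reflected)
            + node.k_transmit * intensity(node.transmitted))


# Hand-built tree for one pixel: a root surface with two leaf intersections.
root = Node(0.25, k_reflect=0.5, k_transmit=0.25,
            reflected=Node(0.5), transmitted=Node(1.0))
pixel = intensity(root)
print(pixel)
```

Because the recursion returns from the deepest intersections first, the contributions are accumulated in the leaves-to-root order described above.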




Fig. 1. These gears were generated on the HP ME Series 30
modeling, design, and drafting system. The ray traced image
was rendered using nonuniform rational B-splines. It is a
polygonal representation with 3084 polygons and
12 partial polygons.
Fig. 1 shows an example of a 3D image generated using
ray tracing.

Ray tracing is an important rendering method. It has
produced some of the most realistic images ever seen to
date. However, it is not without its deficiencies.³,⁴ In the
ray tracing method, realistic shadows are difficult to produce.
In particular, penumbras and shadow envelopes are
seldom seen in ray traced images. Most ray tracing renderers
produce sharp shadow boundaries only.

Most ray tracing systems limit themselves to modeling
point light sources, that is, light sources assumed to
emit light that originates from a single point in space. Light
sources whose emission comes from a finite area are not
readily treated by the method. Only some of the more recent
and exotic methods, such as distributed ray tracing and
ray tracing with cones, attempt to deal with this limitation.

The reflection models used in ray tracing are usually
empirical and approximate. They are often chosen based
on subjective results rather than physical laws of energy
equilibrium. This disallows the modeling of effects such
as color bleeding, where diffuse reflection from one surface
causes a soft colored shadow to be seen on another.

Another problem with ray tracing is that it is inherently
slow. The computational expense of recursively tracing
rays for each pixel on a screen with reasonable resolution
(e.g., 1280 by 1024 pixels) can be prohibitive. Furthermore,
since the scan conversion and global illumination modeling
functions are very tightly coupled, any hardware optimized
for scan conversion that may be available is not
used. The view dependent nature of the ray tracing algorithm
also detracts from the interactivity of the system
employing it. Each change in the viewing transformation
requires that the entire ray tracing process be repeated to
render the new view.

Perhaps the most fundamental flaw of the ray tracing method is that it limits itself to modeling intraenvironment reflections in the specular direction only. Global modeling of diffuse effects is ignored.

Radiosity 

The radiosity method, introduced by Goral and others,5 corrects most of the above deficiencies, but at the expense of introducing some restrictions of its own. The method correctly models the interaction of light between reflecting surfaces if the surfaces are restricted to be perfectly diffuse. It replaces the constant ambient term in Phong's model with an accurate global model. Radiosity has a fundamental energy equilibrium basis, and is derived from methods used in thermal engineering. Fig. 2 shows a 3D image generated using the radiosity method.

In the radiosity method, a (possibly hypothetical) enclosure is constructed around the environment to be rendered. The surfaces or walls of the enclosure completely define the illuminating environment. They consist of light sources and reflecting walls. One or more of the surfaces of the enclosure may be fictitious (e.g., an open window). Each of the surfaces is assumed to be an ideal diffuse reflector, an ideal diffuse emitter, or a combination of the two (Fig. 3).

The radiosity method deals with the equilibrium of radiant energy within the enclosure. The light (or radiosity, which is measured as energy/time/area) leaving a surface i is $B_i$. It consists of direct emission $E_i$ from the surface plus the reflected portion of light arriving at the surface. The light arriving at i, $H_i$, is found by summing the contributions from the other N−1 surfaces, and from surface i if it is concave. Note that there is no need to treat the emitted and reflected energy separately because they are both perfectly diffuse and therefore indistinguishable to the observer. Unlike ray tracing, the history or direction of a ray is lost after reflection from a surface.

Fig. 2. This radiosity image of a cathedral with eight bays of windows and columns was done for two bays, and the remaining bays were generated by a step-and-repeat process. It took 7 minutes of preprocessing on an HP 9000 Model 350 to build the data base and subdivide the polygons (meshing), and 12 minutes per step for 40 steps (using progressive refinement) to generate the image (5 minutes and 8 minutes, respectively, on an HP 9000 Model 370). There are 9916 polygons (14,316 after meshing), 26 area lights, and four point lights.

The total radiosity leaving a given surface i is therefore

$$B_i = E_i + \rho_i H_i \tag{1}$$

where
$B_i$ = radiosity of surface i. This is the total rate at which radiant energy leaves the surface in terms of energy per unit time per unit area (watts per square meter).
$E_i$ = rate of direct energy emission from surface i per unit time per unit area.
$\rho_i$ = reflectivity of surface i. This represents the fraction of incident light that is reflected back into the hemispherical space surrounding surface i.
$H_i$ = incident radiant energy arriving at surface i per unit time per unit area (watts per square meter).

$H_i$ is the sum of all the light leaving the N surfaces of the enclosure that "see" surface i. The fraction of the radiant energy leaving a surface j that impinges on surface i is specified by the form factor or configuration factor $F_{ji}$. The energy per unit time arriving at surface i is therefore

$$H_i A_i = \sum_{j=1}^{N} B_j A_j F_{ji} \tag{2}$$

where $A_i$ is the area of surface i. Dividing through by $A_i$ we have

$$H_i = \sum_{j=1}^{N} B_j F_{ji} \frac{A_j}{A_i} \tag{3}$$

According to the reciprocal nature of form factors,

$$F_{ij} A_i = F_{ji} A_j \tag{4}$$

Therefore, $H_i$ is

$$H_i = \sum_{j=1}^{N} B_j F_{ij} \tag{5}$$

Thus the radiosity at a surface i is

$$B_i = E_i + \rho_i \sum_{j=1}^{N} B_j F_{ij} \tag{6}$$

This may be rewritten as

$$B_i - \rho_i \sum_{j=1}^{N} B_j F_{ij} = E_i \tag{7a}$$

or, for i = 1,

$$\begin{bmatrix} 1-\rho_1 F_{11} & -\rho_1 F_{12} & \cdots & -\rho_1 F_{1N} \end{bmatrix} \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{bmatrix} = E_1 \tag{7b}$$

Considering all N surfaces i we have

$$\begin{bmatrix} 1-\rho_1 F_{11} & -\rho_1 F_{12} & \cdots & -\rho_1 F_{1N} \\ -\rho_2 F_{21} & 1-\rho_2 F_{22} & \cdots & -\rho_2 F_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ -\rho_N F_{N1} & -\rho_N F_{N2} & \cdots & 1-\rho_N F_{NN} \end{bmatrix} \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{bmatrix} = \begin{bmatrix} E_1 \\ E_2 \\ \vdots \\ E_N \end{bmatrix} \tag{7c}$$

Fig. 3. Radiosity relationships. (a) An enclosure of N surfaces, showing surface i and surface j. (b) For surface i: total impinging energy per unit area $\sum_j B_j F_{ij}$, emission $E_i$, total reflected energy per unit area $\rho_i \sum_j B_j F_{ij}$, and radiosity $B_i$. (a) Copyright © 1984 by Goral, Torrance, Greenberg, and Battaile. Used with permission. (b) Copyright © 1986 by Greenberg. Used with permission.

This system of N linear equations with N unknown values $B_i$ has parameters $E_i$, $\rho_i$, and $F_{ij}$, which must be known or calculated for each surface. The $E_i$ are nonzero for surfaces that provide illumination to the enclosure. Such surfaces could represent a diffuse area light source or panel, or the first reflection of a directional light source from a diffuse surface. If all of the $E_i$ are zero, then there is no illumination and all of the $B_i$ are zero.

In general, the $E_i$ and $\rho_i$ are functions of the wavelength of the light. They are usually chosen to represent an average value over a bandwidth of radiation, typically red, green, and blue. Once the form factors are calculated, the above matrix equation is solved numerically for the $B_i$ values for each of three sets of $E_i$ and $\rho_i$ parameters.

The above equation is well-suited to solution using an iterative Gauss-Seidel technique6 because it is diagonally dominant, that is, the sum of the absolute values of the nondiagonal coefficients in each row is less than the absolute value of the main diagonal term. The solution usually converges in six to eight iterations.
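The Gauss-Seidel solution of equation 7c can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `solve_radiosity`, the list-of-lists layout of `F`, and the fixed iteration count are assumptions made for the example. It solves one color band; in practice it would be called once each for red, green, and blue.

```python
def solve_radiosity(E, rho, F, iterations=8):
    """Gauss-Seidel sweeps for (1 - rho_i F_ii) B_i - rho_i sum_{j!=i} F_ij B_j = E_i.

    E, rho: per-patch emission and reflectivity (hypothetical inputs).
    F: N x N form factor matrix, F[i][j] = fraction of energy leaving i that reaches j.
    """
    n = len(E)
    B = list(E)  # initial guess: emission only
    for _ in range(iterations):
        for i in range(n):
            # Use the most recent B values for every other patch (Gauss-Seidel).
            off_diag = sum(F[i][j] * B[j] for j in range(n) if j != i)
            B[i] = (E[i] + rho[i] * off_diag) / (1.0 - rho[i] * F[i][i])
    return B


# Two facing patches: patch 0 emits, patch 1 only reflects.
B = solve_radiosity(E=[1.0, 0.0], rho=[0.5, 0.5], F=[[0.0, 0.2], [0.2, 0.0]])
```

Because each $\rho_i$ is less than 1 and each row of form factors sums to at most 1, every sweep shrinks the error, which is why only a handful of iterations are needed.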

The aforementioned surfaces generally are not the same as the surfaces of the representation chosen for the geometric model of the scene. For example, if objects are described using polygons, the polygons are usually subdivided into patches or elements (i.e., smaller polygons). These patches are the surfaces of the enclosure.

Once the radiosity for each primary color for each patch has been found, it is mapped onto the vertices of its associated polygon so that the vertex radiosity (or color) values can be bilinearly interpolated across the polygon using either Gouraud shading7 or object-space interpolation. A good way to do this is to set the radiosities at the vertices of patches that are interior to a given polygon to the average of the adjacent patch radiosities and then extrapolate outward to the polygon vertices.

The process of image generation using the radiosity method can be summarized as follows:

■ Take the input geometry and subdivide it into patches.
■ Calculate form factors.
■ Solve for the $B_i$ for each primary color.
■ Extrapolate the $B_i$ to polygon vertices and render.

Once the form factors are calculated, they need not be recalculated if colors ($\rho$) or light sources ($E$) change. Also, as long as the geometry of the objects remains static, dynamic views of the scene can be generated by merely rerendering. This can be highly interactive on a workstation such as the HP 9000 Model 835 TurboSRX, which has dedicated hardware optimized for polygon rendering.

Form Factor Calculation 

We now consider the calculation of $F_{ij}$, the fraction of the energy leaving surface i impinging on surface j (Fig. 4). Because our surfaces are assumed to be perfectly diffuse, the form factor is purely geometrical in nature. It depends only on the shape, size, position, and orientation of the participating surfaces.

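For nonoccluded environments, the form factor from one differential area (i) to another (j) is given by

$$F_{dA_i\,dA_j} = \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j \tag{8}$$

where r is the distance between the two differential areas and $\phi_i$ and $\phi_j$ are the angles between each surface normal and the line joining the two areas (Fig. 4). Integrating over area $A_j$, the form factor to a finite area or patch is

$$F_{dA_i\,A_j} = \int_{A_j} \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j \tag{9}$$

The form factor between finite surfaces (patches) is defined as the area average and is thus

$$F_{ij} = F_{A_i A_j} = \frac{1}{A_i} \int_{A_i} \int_{A_j} \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j\, dA_i \tag{10}$$

From the symmetry of this equation we can derive the reciprocal relationship given in equation 4. Some other important properties of form factors are:

■ From the law of conservation of energy:

$$\sum_{j=1}^{N} F_{ij} = 1 \tag{11}$$

■ For any surface that does not see itself (planar or convex):

$$F_{ii} = 0 \tag{12}$$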

The Hemicube Algorithm 

For occluded environments, equation 10 becomes

$$F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j} \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, (\mathrm{HID})\, dA_j\, dA_i \tag{13}$$

where the Boolean function HID takes on the value 1 or 0 depending on whether $dA_i$ can see $dA_j$. This double area integral is difficult to solve analytically for general cases. An area integral, which is a double integral itself, can be transformed via Stokes' theorem into a single contour integral, which can then be evaluated numerically, but at considerable computational expense. Numerical approximation techniques can provide a more efficient means to compute form factors for general complex environments. The hemicube algorithm8 employs such a numerical method and also addresses how to deal with the HID function.



Fig. 4. Form factor geometry. Copyright © 1986 by Greenberg. Used with permission.



Inner Integral Approximation 

If the distance between the two patches i and j is large compared to their areas, and if they are not partially occluded from one another, the integrand of the inner integral of equation 13 remains almost constant over the area $A_i$. If we let K approximate the inner integral we have

$$F_{ij} = F_{A_i A_j} \approx \frac{1}{A_i} \int_{A_i} K\, dA_i = \frac{K A_i}{A_i} = K = \int_{A_j} \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j \tag{14}$$

Thus finding a solution for the inner integral K, the differential-area-to-finite-area form factor, equation 9, will provide a good approximation for the form factor from patch to patch. If the patches are close together relative to their size, or if there is partial occlusion, the patches must be subdivided into smaller patches until equation 9 provides a good approximation.
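As a sanity check on equation 9, the integral can be evaluated numerically for a case with a known closed form. The sketch below is an illustration only: the geometry (a differential area looking straight up at a parallel, coaxial disk of radius R at height h) and the function name are assumptions chosen for the example, and the closed form $R^2/(h^2+R^2)$ is the standard textbook result for that configuration.

```python
import math


def point_to_disk_form_factor(h, R, n_r=200, n_theta=64):
    """Midpoint-rule estimate of equation 9 for a coaxial parallel disk."""
    dr = R / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for a in range(n_r):
        s = (a + 0.5) * dr                 # radial position on the disk
        dist2 = h * h + s * s              # squared distance r^2 to the sample
        cos_i = h / math.sqrt(dist2)       # angle at the differential area
        cos_j = cos_i                      # disk is parallel, so the angles match
        for _ in range(n_theta):
            dA = s * dr * dtheta           # area of the polar cell
            total += cos_i * cos_j / (math.pi * dist2) * dA
    return total


numeric = point_to_disk_form_factor(h=1.0, R=1.0)
analytic = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)
```

With no occlusion the two values agree closely, which is the situation in which the inner integral alone is a good stand-in for the full patch-to-patch form factor.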

The Nusselt Analog 

To see how to evaluate the form factor integral numerically, Nusselt's geometric analog9 to the form factor integral is helpful (Fig. 5).

Each differential area patch has its own view of the environment, which is the hemisphere of directions surrounding its normal. For a finite area, the form factor is equivalent to the fraction of the circle (which is the base of the hemisphere of directions) covered by projecting the area onto the hemisphere and then orthographically down onto the circle.

The easiest way to see that this analogy correctly describes the form factor integral is to think of it in terms of solid angles and projected areas. The area of plane A that is seen by or projected onto plane B is the area of A times the cosine of the angle between the normals of the two planes. It is equal to the area of the shadow that A would cast onto B. The solid angle can be thought of as a generalization of the planar angles with which we are familiar. Recall that a planar angle $\theta$, measured in radians, is defined to be equal to the length of the arc subtended by the angle divided by the radius r of the circle containing the arc. Since the total circumference of a circle is $2\pi r$, there are $2\pi$ radians in a circle. Similarly, a solid angle $\omega$, measured in steradians, is defined to be the area subtended by the solid angle divided by the square of the radius of the sphere containing the area. Since the total area of a sphere is $4\pi r^2$, there are $4\pi$ steradians in a sphere (and $2\pi$ steradians in a hemisphere). Stated another way, one steradian subtends a unit area of a unit sphere.

Now, returning to our inner integral approximation, equation 9, the solid angle $d\omega$ that subtends the infinitesimal area $dA_j$ is (see Fig. 6):

$$d\omega = \frac{\delta A_j}{r^2} \tag{15}$$

Fig. 5. The Nusselt analog. The form factor is equal to the fraction of the base of the hemisphere covered by the projection. Copyright © 1985 by Cohen and Greenberg. Used with permission.

where $\delta A_j$ is the portion of the area of the sphere with radius r that is projected by $dA_j$ in the direction of r. Since $dA_j$ is infinitesimally small, this projected area is planar, and is given by $\cos\phi_j\, dA_j$. Thus, $d\omega$ is

$$d\omega = \frac{\cos\phi_j\, dA_j}{r^2} \tag{16}$$

Now, returning to the unit hemisphere of the Nusselt analog, the area on the unit hemisphere projected by $dA_j$ is

$$(\text{solid angle})(\text{radius}^2) = d\omega\,(1^2) = d\omega = \frac{\cos\phi_j\, dA_j}{r^2} \tag{17}$$

The projection of this area onto the base of the unit hemisphere is

$$(\cos\phi_i)(d\omega) = \frac{\cos\phi_i \cos\phi_j\, dA_j}{r^2} \tag{18}$$

Taking the ratio of this area to the total area of the base of the unit hemisphere ($\pi$) we have, as before, the differential form factor

$$F_{dA_i\,dA_j} = \frac{\cos\phi_i \cos\phi_j\, dA_j}{\pi r^2} \tag{19}$$

Integrating these differential form factors over $A_j$ and then taking the area average of this integral gives us the double area integral expressed in equation 10 for the form factor $F_{ij}$. Using the inner integral as an approximation is equivalent to using the center point of patch i to represent the average position of patch i, constructing a unit hemisphere around this point, and summing the differential form factors.

Fig. 6. The area $dA_j$ is taken to be the area of patch j that is visible through the solid angle $d\omega$. (Labels: patch i, patch j, hemisphere of directions, solid angle.) Adapted from Wallace.10 Used with permission.


Delta Form Factors

To approximate the inner integral, the hemisphere of directions can be divided into discrete solid angles $\Delta\omega$ and a delta form factor can then be calculated:

$$\Delta F = \frac{\cos\phi_i}{\pi}\, \Delta\omega \tag{20}$$

The form factor $F_{ij}$ can then be approximated by summing the delta form factors covered when projecting patch j onto the unit hemisphere surrounding the center of patch i. If all the patches in the environment are projected onto the hemisphere, discarding the projections of the more distant patches in the case of two or more patches with overlapping projections, the sums of the delta form factors covered by these projections give the form factors from all patches to the patch represented at the center of the hemisphere. This procedure intrinsically includes the effects of hidden surfaces.

To make this procedure practical, a convenient means of dividing the surface of the hemisphere into discrete areas (subtended by the discrete solid angles $\Delta\omega$) is needed. The delta form factors for each of these discrete areas could be precalculated and stored in a lookup table. An evaluation can then be made as to which patch projects onto a given discrete area. For a given patch j, the form factor calculation problem is reduced to determining through which of the discrete solid angles $\Delta\omega$ surface j is visible. Unfortunately, for a hemisphere, it is difficult to devise a method of creating equal discrete areas and a set of linear coordinates to describe the locations of these areas uniquely.




Fig. 7. Areas with identical form factors. Areas A, B, C, D, and E all have the same form factor. Copyright © 1985 by Cohen and Greenberg. Used with permission.



The Hemicube 
It would be handy if we could choose a more convenient surface than a hemisphere to project the patches onto. From the Nusselt analog it can be seen that any two patches in the environment that project onto the same set of discrete areas of the hemisphere will have the same form factor value. Said another way, any two areas that are seen through the same set of delta solid angles will have the same form factor. In Fig. 7, E is the set of discrete areas on the hemisphere, and A, B, C, and D all have the same form factor. Consider area D. If we allow D to be part of the top part of a cube surrounding the patch i of interest, we can determine the form factor from patch i to the patch with area A by calculating the form factor to the patch on the cube with area D from patch i. Thus, instead of projecting directly onto the unit hemisphere, we can first project onto a "hemicube" and then calculate the form factor of the intermediate patch that has area equal to the projected area of the original patch.

More specifically, an imaginary cube is constructed around the center of the patch i of interest (Fig. 8). The environment is then transformed to set patch i's center at the origin (eye) with the patch's normal coincident with the positive Z axis (assuming a left-handed coordinate system). The cube is sized so that the perpendicular distance from the center of the patch to the surface of the cube is 1. In this orientation, the aforementioned unit hemisphere is surrounded by the upper-half surfaces of the cube, the lower half being below the horizon of the patch. One full face, facing in the +Z direction, and four half faces, facing in the ±X and ±Y directions, replace the hemisphere. These faces are divided into square discrete areas (pixels) at some resolution, usually between 50×50 and 100×100, and the environment is then projected onto the five planar faces.

The beauty of this scheme is that the mathematics and algorithms involved in these projections are well-known: the same clipping, projection, and hidden surface removal techniques used for projection of an environment onto a raster display screen can be used here. (Hardware optimized for these operations can also be employed.) The view direction is set equal to each of the +Z, +X, −X, +Y, and −Y axes, and every other patch in the environment is projected onto each of the five "screens," which are the faces of the hemicube perpendicular to each of these five directions. Each full face of the cube covers a 90° frustum as viewed from the center of the cube. This creates clipping planes of Z = X, Z = −X, Z = Y, and Z = −Y that can be used in a simple Sutherland-Hodgman clipper11 streamlined to handle 90° frustums. Each projected patch can then be scan converted or rasterized to determine which patch's projection covers a given pixel. If two patches project onto the same hemicube pixel, a Z-buffer algorithm can be used to decide which patch is seen in the discrete solid angle represented by the pixel. However, unlike the conventional Z-buffer algorithm used for image rendering, intensity data is not stored for each pixel. Instead, the frame buffer is used as an item buffer to store an integer identifying the patch that is seen by the pixel represented by the item buffer address.
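The item buffer idea can be sketched in a few lines. This is a hedged illustration rather than the article's code: `make_item_buffer` and `write_pixel` are hypothetical names, the buffers are plain nested lists, and the scan conversion that would generate the `(row, col, z)` samples for each projected patch is assumed to happen elsewhere.

```python
def make_item_buffer(resolution):
    """Create a depth buffer and an item buffer for one hemicube face."""
    depth = [[float("inf")] * resolution for _ in range(resolution)]
    items = [[None] * resolution for _ in range(resolution)]  # None = nothing seen
    return depth, items


def write_pixel(depth, items, row, col, z, patch_id):
    """Z-buffer test: keep only the nearest patch seen through this pixel."""
    if z < depth[row][col]:
        depth[row][col] = z
        items[row][col] = patch_id  # store a patch identifier, not an intensity


depth, items = make_item_buffer(4)
write_pixel(depth, items, 1, 2, 5.0, patch_id=7)
write_pixel(depth, items, 1, 2, 3.0, patch_id=9)  # nearer patch replaces patch 7
```

Storing identifiers instead of colors is what lets the same pass resolve both visibility (the HID term) and the pixel coverage needed for the form factor sum.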

After determining which patch j projects onto each hemicube pixel, a summation of delta form factors for each pixel covered by patch j determines the form factor from patch i at the center of the hemicube to patch j. That is,

$$F_{ij} = \sum_{q=1}^{R} \Delta F_q \tag{21}$$

where $\Delta F_q$ is the delta form factor associated with hemicube pixel q, and R is the number of hemicube pixels covered by projection of patch j onto the hemicube surrounding element i.

Fig. 8. (a) The hemicube. An imaginary cube is created around the center of a patch. (Labels: square pixel, imaginary cube, patch j, patch normal.) (b) Projection of the environment onto the hemicube. Every other patch in the environment is projected onto the cube. Copyright © 1985 by Cohen and Greenberg. Used with permission.

This summation is performed for each patch in the environment to form a complete row of N form factors. Then the hemicube is positioned around another patch i and the process is repeated.

The delta form factor values represented by each hemicube pixel are easily calculated from the delta form factor equation (20) and can be stored in a lookup table. Because of symmetry, this table need only contain values for one eighth of the top face and one half of a side face of the hemicube (Fig. 9).
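The lookup table and the summation of equation 21 can be sketched as below. This is an assumption-laden illustration: it uses the standard hemicube pixel expressions that follow from equation 8 for a unit-distance cube (top face pixel at (x, y, 1): $\Delta A / (\pi (x^2+y^2+1)^2)$; side face pixel at height z: $z\,\Delta A / (\pi (y^2+z^2+1)^2)$), the function names are hypothetical, and for clarity it builds full tables instead of exploiting the symmetry mentioned above. `res` is assumed to be even.

```python
import math


def top_face_deltas(res):
    """Delta form factors for the +Z face (x, y in [-1, 1], z = 1)."""
    cell = 2.0 / res
    dA = cell * cell
    table = []
    for r in range(res):
        y = -1.0 + (r + 0.5) * cell
        table.append([dA / (math.pi * (x * x + y * y + 1.0) ** 2)
                      for x in (-1.0 + (c + 0.5) * cell for c in range(res))])
    return table


def side_face_deltas(res):
    """Delta form factors for one half face (y in [-1, 1], z in [0, 1])."""
    cell = 2.0 / res
    dA = cell * cell
    table = []
    for r in range(res // 2):
        z = (r + 0.5) * cell
        table.append([z * dA / (math.pi * (y * y + z * z + 1.0) ** 2)
                      for y in (-1.0 + (c + 0.5) * cell for c in range(res))])
    return table


def form_factor_row(item_faces, delta_faces, n_patches):
    """Equation 21: add up delta form factors over the pixels each patch covers."""
    F_row = [0.0] * n_patches
    for items, deltas in zip(item_faces, delta_faces):
        for item_row, delta_row in zip(items, deltas):
            for patch_id, dF in zip(item_row, delta_row):
                if patch_id is not None:
                    F_row[patch_id] += dF
    return F_row


res = 100
top, side = top_face_deltas(res), side_face_deltas(res)
total = sum(map(sum, top)) + 4 * sum(map(sum, side))  # should be close to 1
```

The total over the top face and four half faces comes out close to 1, which is the discrete counterpart of equation 11.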

In summary, the hemicube algorithm provides two main contributions. It provides a very practical method of numerically approximating the form factor integral, and provides a method of properly accounting for the effects of hidden and occluded surfaces at minimal additional expense.

Substructuring 

The hemicube algorithm, as presented above, has some problems. Areas in a scene with high intensity (radiosity) gradients (shadow boundaries and penumbra, for example) may be poorly represented, particularly when the patches are large relative to the area over which the radiosity gradient occurs. To remedy this, the areas of surfaces with high radiosity gradients must be subdivided into finer and finer grids of patches. This presents two problems: how to increase the number of patches without incurring significant additional computational cost, and how to decide which areas of the scene should be subdivided. These problems were addressed in a paper by Cohen and others.12

The solution of the radiosity simultaneous equilibrium equations using the Gauss-Seidel technique is $O(N^2)$, that is, the number of calculations required is of order $N^2$, where N is the number of patches used to describe the scene. The calculation of the form factors is also $O(N^2)$. If the first problem is not addressed and N is naively increased, the computational costs can be prohibitive.

To remedy this situation we borrow a concept from engineering mechanics known as substructuring, where the solution for local stress behavior is based on the global structure response to a coarse solution. Applying this notion to the radiosity problem, we subdivide the patches that are too large into a total of M elements (according to criteria to be discussed later), leaving K unsubdivided patches. It is assumed that each element has a constant radiosity, but that these element radiosities vary across the patch. Next, we would like to be able to find the radiosities of the elements using a solution for the radiosities of the original patches and avoiding a full $O((M+K)^2)$ solution, somehow applying the solution for the global patch radiosities to the elements.

Element Radiosities 

To see how to do this, assume that patch i has been subdivided into R elements. We can then represent $B_i$, the radiosity of patch i, as the average over the area of the patch of the element radiosities:

$$B_i = \frac{1}{A_i} \sum_{q=1}^{R} B_q A_q \tag{22}$$

From the definition of the radiosity of an element q in equation 6 we know that

$$B_q = E_q + \rho_q \sum_{j=1}^{N} B_j F_{qj} \tag{23}$$

Substituting equation 23 into equation 22, we have

$$B_i = \frac{1}{A_i} \sum_{q=1}^{R} \left( E_q + \rho_q \sum_{j=1}^{N} B_j F_{qj} \right) A_q \tag{24a}$$

Distributing, we have

$$B_i = \frac{1}{A_i} \sum_{q=1}^{R} E_q A_q + \frac{1}{A_i} \sum_{q=1}^{R} \left( \rho_q \sum_{j=1}^{N} B_j F_{qj} A_q \right) \tag{24b}$$

If we assume the emission and reflectivity of the patch are constant, then $E_i = E_q$ and $\rho_i = \rho_q$. Also, if the global radiosities $B_j$ are assumed constant for each element over the surface of each patch, we have

$$B_i = \frac{E_i}{A_i} \sum_{q=1}^{R} A_q + \rho_i \sum_{j=1}^{N} B_j \left( \frac{1}{A_i} \sum_{q=1}^{R} F_{qj} A_q \right) \tag{24c}$$

and, because the element areas $A_q$ sum to $A_i$,

$$B_i = E_i + \rho_i \sum_{j=1}^{N} B_j \left( \frac{1}{A_i} \sum_{q=1}^{R} F_{qj} A_q \right) \tag{24d}$$

Fig. 9. Derivation of delta form factors. Copyright © 1985 by Cohen and Greenberg. Used with permission.

By comparing with equation 6, the quantity in parentheses in equation 24d is easily seen to be the patch-to-patch form factor expressed as the area-weighted average of the element-to-patch form factors, where the elements are subdivisions of patch i. Thus

$$F_{ij} = \frac{1}{A_i} \sum_{q=1}^{R} F_{qj} A_q \tag{25}$$



Each of the element-to-patch form factors $F_{qj}$ can be found using the hemicube algorithm. Then the patch-to-patch form factors $F_{ij}$ are calculated using equation 25. The standard system of simultaneous radiosity equations (7c) can then be solved to yield the patch radiosities in $O(N^2)$ time. The resulting patch radiosities are more accurate than those that would have been obtained without subdividing the patches into elements. This is because the expression for $F_{ij}$ given by equation 25 represents a discrete numerical method for approximating the outer area integral of the form factor double area integral given in equation 13. Recall that in the original hemicube algorithm the outer integral was taken to be unity because we restricted the patches to be small relative to the distances that separate them. We now can remove that restriction by using equation 25, but placing the same restriction on the elements that make up a patch.

Once the patch radiosities have been calculated, the radiosity for each element q can be found using the basic equation for the radiosity of an element, equation 23.

In short, subdivision of patches into elements and use of the above equations provides two main advantages over naive use of a finer patch resolution. First, the local variations of intensity within a patch can be accurately approximated without having to solve the global radiosity equations on an element level. Second, the radiosity solution on the patch level is more accurate because it better approximates the patch-to-patch form factors.

Substructuring Algorithm 

The following are the major steps involved in employing these optimizations in a rendering algorithm:

1. Form a hierarchical description of the environment consisting of surfaces, subsurfaces, patches, and elements.

2. For each element, find the form factor to each patch using the hemicube algorithm, and store the results in an M×N matrix (M = number of elements, N = number of patches).

3. Compress this matrix into an N×N patch form factor matrix using equation 25.

4. For each of the primary color bands (red, green, and blue), form and solve the set of N equations in N unknowns for the patch radiosities using the Gauss-Seidel iterative technique and equation 7c.

5. Compute the M element radiosities for each element q using equation 23, the patch radiosities, and the element-to-patch form factors computed in step 2.

6. Calculate element vertex radiosities from the radiosities of the elements adjacent to the vertex.

7. Linearly interpolate the vertex radiosities across the elements using Gouraud shading or object-space interpolation.
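Steps 3 and 5 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the names `patch_form_factors` and `element_radiosities` are hypothetical, `elem_patch[q]` is assumed to give the index of the patch that element q belongs to, and an unsubdivided patch is represented as a patch with a single element.

```python
def patch_form_factors(F_elem, elem_area, elem_patch, n_patches):
    """Equation 25: F_ij = (1/A_i) * sum over elements q of patch i of F_qj * A_q."""
    F = [[0.0] * n_patches for _ in range(n_patches)]
    patch_area = [0.0] * n_patches
    for q, i in enumerate(elem_patch):
        patch_area[i] += elem_area[q]          # A_i is the sum of its element areas
        for j in range(n_patches):
            F[i][j] += F_elem[q][j] * elem_area[q]
    for i in range(n_patches):
        for j in range(n_patches):
            F[i][j] /= patch_area[i]
    return F


def element_radiosities(E_elem, rho_elem, F_elem, B_patch):
    """Equation 23: B_q = E_q + rho_q * sum_j B_j * F_qj, using solved patch B_j."""
    n = len(B_patch)
    return [E_elem[q] + rho_elem[q] * sum(F_elem[q][j] * B_patch[j] for j in range(n))
            for q in range(len(E_elem))]


# Patch 0 is split into elements 0 and 1; patch 1 is the single element 2.
F_elem = [[0.0, 0.4], [0.0, 0.2], [0.5, 0.0]]
F_patch = patch_form_factors(F_elem, elem_area=[1.0, 3.0, 2.0],
                             elem_patch=[0, 0, 1], n_patches=2)
B_elem = element_radiosities([0.0, 0.0, 0.0], [0.5, 0.5, 0.5], F_elem, [1.0, 2.0])
```

The expensive solve is done only on the N×N matrix, while the M×N element matrix is used twice: once to build the patch form factors and once to push the patch solution back down to the elements.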

Adaptive Subdivision 

We now address the criteria to be used in deciding how to partition the scene hierarchically down to the element level. Ideally, the element mesh should be densest in regions of high intensity gradients. Cohen's paper12 says that a reasonable first guess must be provided by the user as to which areas are likely to have high intensity gradients. These areas include areas in shadow and areas near light sources. Then the intensities of adjacent vertices found in step 6 can be compared. If the change in intensity is greater than some threshold value, the elements adjacent to that vertex should be recursively subdivided until the intensity change is below the threshold. The algorithm is then recursively repeated, beginning at step 2.

Cohen used a simple binary subdivision, where each rectangular element is divided into four new elements. This preserves the original patch's geometry and allows most of the previously computed element-to-patch form factors to be reused. The only change that needs to be made to the original M×N element-to-patch form factor matrix to subdivide a particular element i is to remove row i from the matrix and insert four new element rows. (Of course, the hemicube algorithm must be used to calculate the elements of the new rows.) This object-space subdivision technique is analogous to the Warnock algorithm,13 which subdivides polygons to perform hidden surface removal.



Progressive Refinement 

Perhaps the most significant improvement to the radiosity method is the algorithm based on the technique of progressive refinement devised by Cohen and his colleagues.14 This algorithm has two main advantages over those we have described so far.

First, it provides renderings of the environment that are early approximations of the final energy-equilibrium solution. This has the advantage of allowing the user to see advance previews approximating the final correctly rendered scene without having to wait for the full $O(N^2)$ solution to equation 7c. At each step of the progressive refinement approach, the rendering of the scene gracefully converges to the full solution. The user can interactively stop this progression when the rendering looks good enough. In most cases, a useful image is produced in $O(N)$ time.

The second advantage of the progressive refinement approach is a reduction in storage and start-up computational costs. The previous algorithms require that all form factors be precalculated before the Gauss-Seidel solution begins. This requires $O(N^2)$ storage. For reasonably complex environments, this cost can be significant. For example, an environment of 50,000 patches will require a gigabyte of storage.14 In the progressive refinement algorithm, form factors are calculated on the fly to reduce the form factor storage requirements to $O(N)$ and eliminate the associated start-up computational costs.

The progressive refinement algorithm can be thought of as a restructuring of previous methods, and differs from them primarily in two ways. First, the radiosity of all patches is updated simultaneously instead of one at a time during each iteration. Second, patches are processed in sorted order according to their energy contribution to the environment.

To gain an insight into how this is possible, consider row i of equation 7c (i.e., equation 6). This equation may be thought of as one that determines the light leaving patch i by gathering in the light from the rest of the environment (Fig. 10).

A single term from the summation in equation 6 determines the contribution of patch j to the radiosity of patch i, that is,

$$\text{Contribution of } B_j \text{ to } B_i = \rho_i B_j F_{ij} \tag{26}$$

The progressive refinement method reverses this process by considering the contribution made by patch i to the radiosity of all other patches. The reciprocity relationship (equation 4) provides the basis for this reversal. The contribution of the radiosity from patch i to the radiosity of patch j is

$$\text{Contribution of } B_i \text{ to } B_j = \rho_j B_i F_{ji} = \rho_j B_i F_{ij} \frac{A_i}{A_j} \tag{27}$$

The total contribution to the environment from the radiosity of patch i is determined by calculating the above equation for all patches j.

A key fact about this reformulation is that the radiosities of the patches j in the environment are updated using form factors calculated via a single hemicube placed at patch i. Thus, each step of the iteration no longer requires that all of the form factors $F_{ij}$ be known in advance. Each step of the solution now consists of placing a single hemicube around a patch i and adding the contribution from the radiosity of that patch to the radiosities of all other patches, calculating form factors as needed. In effect, we are shooting light from patch i out into the environment rather than gathering the light from the environment received at patch i (Fig. 10). For a more detailed description of this iterative shooting algorithm, consult the literature.14

To arrive at the final solution as quickly as possible, we capitalize on the fact that if the patches i with the largest contribution to the environment are processed first, the final value for the radiosity of patch j, which is the sum of these contributions, will be approached earlier. Stated intuitively, those patches radiating the most light energy should be treated first, since they have the greatest effect on the illumination of the environment. This energy will tend to come from those patches having the largest product $B_i A_i$.

Accordingly, the progressive refinement algorithm is implemented 
by always shooting first from patches for which 
the difference $\Delta B_i A_i$ between the previous and current 
estimates of unshot radiosity is greatest. This usually results 
in most light sources being processed first, followed by the 
patches that receive the most light from the light sources, 
and so on. Thus, when solving in sorted order, the solution 
tends to proceed in nearly the same order as light would 
propagate through the environment. Solving in sorted order 
usually yields a useful estimate of the final solution in less 
than a single full iteration, substantially reducing computational 
costs. Fig. 2 was rendered using a progressive refinement 
technique. 
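
The sorted-order loop can be sketched in Python as below. It is an illustration under stated assumptions, not the implementation described in the literature or in HP's product: reciprocity is taken as $A_i F_{ij} = A_j F_{ji}$, the callable `form_factor_row` is a hypothetical stand-in for placing a hemicube at patch i, and the stopping tolerance `tol` and the two-patch test scene are invented for the example.

```python
def progressive_refinement(E, rho, A, form_factor_row, tol=1e-9, max_steps=1000):
    """Shoot from the patch with the most unshot energy until little remains.

    form_factor_row(i) must return the form factors F_ij for fixed i; it is
    called only when patch i is actually chosen to shoot.
    """
    n = len(E)
    B = list(E)        # current radiosity estimates
    unshot = list(E)   # radiosity each patch has received but not yet shot
    steps = 0
    while steps < max_steps:
        # Choose the patch with the largest unshot energy (radiosity * area).
        i = max(range(n), key=lambda k: unshot[k] * A[k])
        if unshot[i] * A[i] <= tol:
            break
        F_row_i = form_factor_row(i)
        for j in range(n):
            if j != i:
                received = rho[j] * unshot[i] * F_row_i[j] * A[i] / A[j]
                B[j] += received
                unshot[j] += received
        unshot[i] = 0.0
        steps += 1
    return B, steps


# Hypothetical scene: two unit-area patches facing each other, patch 0 emits.
F = [[0.0, 0.5], [0.5, 0.0]]
B, steps = progressive_refinement(
    E=[1.0, 0.0], rho=[0.5, 0.5], A=[1.0, 1.0],
    form_factor_row=lambda i: F[i],
)
```

Because `B` is a usable estimate after every step, the loop can be interrupted early to display an intermediate image, which is the practical appeal of the progressive approach.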

Summary and Conclusions 

The radiosity global illumination method combats many 
of the deficiencies of ray tracing. Radiosity methods produce 
excellent penumbras, shadow envelopes, and color 
bleeding effects. Area light sources are accurately modeled. 
The radiosity model is "correct" in the sense that it is based 
on laws of physics (energy equilibrium because of conservation 
of energy). In the radiosity method, illumination 
modeling is decoupled from scan conversion and rendering. 
Finally, radiosity algorithms are view independent. 
This allows a high degree of interactivity for static geometry 
once the preprocessing is complete. 

Despite these advantages, radiosity also has some disadvantages 
with respect to ray tracing. Rendering using the 
full radiosity solution is slow (although proponents claim 
it is faster than ray tracing because it is view independent). 
Also, specular reflections, transparency, and translucency 
are not modeled. 

Radiosity and ray tracing are complementary methods. 
No one method models reality perfectly (although radiosity 
advocates point out that most natural environments are 
predominantly diffuse). Recent research involves combining 
aspects of both methods. For example, in a very recent 
paper by Wallace and his colleagues,17 a ray tracing 
technique is used to compute form factors, instead of the 
hemicube algorithm. Also, techniques have been recently 
proposed for producing specular highlights along with 
global diffuse illumination components. ! ' 

There is still a fair amount of research that needs to be 
done before an interactive global model can be offered that 
models an environment perfectly without having to sacrifice 
diffuse components, as in ray tracing, or specular 
highlights, as in radiosity. However, the illumination models 
that have been developed to date are extremely useful 
and should be made available to users of graphics workstations. 
Accordingly, Hewlett-Packard chose to become the 
first workstation vendor to offer radiosity-based illumination 
modeling as well as the more traditional methods. In 
July 1989, HP released its Starbase Radiosity and Ray Tracing 
software, which integrates into the Starbase display list 
support for both radiosity and ray tracing on high-end TurboSRX 
workstations. This gives the graphics programmer 
a choice of three illumination methods: local illumination 
based on an enhanced Phong model, global illumination 
based on anti-aliased ray tracing, and global illumination 
using progressive refinement radiosity. Applications using 
the Starbase display list can now be written to provide the 
user with the widest possible variety of photorealistic rendering. 

Fig. 10. Gathering versus shooting. Gathering light through a 
hemicube allows one patch radiosity to be updated. Shooting 
light through a single hemicube allows the entire environment's 
radiosity values to be updated simultaneously. Copyright © 1989 
by Cohen, Chen, Wallace, and Greenberg. Used with permission. 

We have presented a tutorial summary of the theory and 
algorithms of the radiosity method that have appeared in 
the literature over the last few years. We have done so with 
the hope that the reader will gain an intuitive feel for the 
method, some of the improvements that have been made 
to it, and the advantages that may be gained from it. No 
attempt has been made to discuss the particulars of HP's 
implementation, which makes use of the most recent advances.17 